Microprocessor equipped with an arithmetic and logic unit and with a hardware security module

ABSTRACT

This microprocessor is configured to compute a code C_(i), used to detect an execution fault, using a relationship C_(i)=P o F_(α)(D_(i)), where:
F_(α)(D_(i))=E₀ o . . . o E_(q) o . . . o E_(NbE−1)(D_(i)), E_(q)(x)=T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α1,q) o T_(α0,q)(x), and T_(αj,q) is a conditional transposition, configured by a secret parameter α_(j,q), that permutes two blocks of bits B_(2j+1,q) and B_(2j,q) of the variable x only when the parameter α_(j,q) is equal to a first value, the blocks B_(2j+1,q) and B_(2j,q) of all of the transpositions T_(αj,q) of the stage E_(q) being different from one another and not overlapping, and the blocks B_(2j+1,q) and B_(2j,q) being placed within one and the same block of greater size permuted by a transposition of the higher stage E_(q+1).

The invention relates to a microprocessor equipped with an arithmetic and logic unit and with a hardware security module.

Numerous attacks are possible in order to obtain information about a binary code or to cause unexpected operation of the binary code. For example, attacks known under the name “fault injection” or “fault attack” may be implemented. These attacks involve disrupting the operation of the microprocessor or the memory containing the binary code, using various physical means such as modifying supply voltages, modifying the clock signal, exposing the microprocessor to electromagnetic waves, inter alia.

Using such attacks, an attacker is able to alter the integrity of machine instructions or data in order for example to recover a secret key of a cryptographic system, bypass security mechanisms such as verification of a PIN code during authentication, or simply prevent the execution of a function essential to the security of a critical system.

These attacks may notably cause three types of fault, called execution faults, when the binary code is executed:

1) altering the instructions of the machine code that is executed,

2) altering the data stored in the main memory or in registers of the microprocessor, and

3) altering the control flow of the machine code.

The control flow corresponds to the execution path that is followed when the machine code is executed. The control flow is conventionally depicted in the form of a graph, known under the name “control flow graph”.

To detect such execution faults, there has already been the proposal to associate an error correction code with each data item processed by the microprocessor. Next, the error correction code associated with the result of the instruction that processes these data is computed from the error correction codes of the processed data. In this way, if a fault occurs when this instruction is executed, the result obtained does not correspond to the computed error correction code. This allows this fault to be detected. Such a solution is for example disclosed in application FR3071082. The algorithm for constructing the error correction code associated with a data item is known. It is therefore possible for an attacker to inject faults in order to modify the error correction code computed for the result so that it corresponds to the faulted result. In this case, the execution fault is not detected.

To overcome the above disadvantage, it has been proposed that the error correction code be replaced by an integrity code. This integrity code is constructed from the data item and, in addition, using a secret key known only to the microprocessor. It is thus difficult for an attacker to modify an integrity code so that it corresponds to a faulted result, because the attacker does not know the secret key. However, it should always be possible to construct the integrity code for the result using the integrity codes associated with the processed data and without using the result of the instruction executed by the arithmetic and logic unit. For example, such a solution is described in the following article: L. De Meyer, V. Arribas, S. Nikova, V. Nikov and V. Rijmen: “M&M: Masks and Macs against physical attacks”, IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 25-50, 2019. This article is subsequently denoted by the term “DEMEYER2019”. The method described in this article for computing the integrity code for the result from the integrity codes of the processed data is complex. The reason is that it does this by using multiplications in a Galois field. Hardware circuits that rapidly compute multiplications in a Galois field are complex and slow. The method of DEMEYER2019 is therefore difficult to implement in a microprocessor, including in the case of Boolean operations.

Prior art is also known from EP3457620A1 and from the following article: Savry Olivier et al.: “Confidaent: Control Flow protection with Instruction and Data Authenticated Encryption”, 2020 23rd Euromicro Conference on Digital System Design, 26/08/2020, pages 246-253.

The objective is to propose a microprocessor that has the same level of security for at least one executed arithmetic and logic operation as that described in the article of DEMEYER2019 but that is easier to produce.

The invention therefore relates to such a microprocessor.

The invention will be better understood on reading the description that follows, which is given solely by way of non-limiting example, with reference to the drawings, in which:

FIG. 1 is a schematic illustration of the architecture of an electronic apparatus capable of executing a binary code;

FIG. 2 is a schematic illustration of the structure of a register of the apparatus of FIG. 1,

FIG. 3 is a schematic illustration of a function F_(α) executed by the apparatus of FIG. 1;

FIG. 4 is a flowchart of a method for executing the binary code by means of the apparatus of FIG. 1,

FIG. 5 is a schematic illustration of a computation circuit for computing a code C_(res−t) in the case of an arithmetic shift operation for shifting one bit to the left,

FIGS. 6 and 7 are more detailed schematic illustrations of two components of the circuit of FIG. 5,

FIGS. 8 to 10 are schematic illustrations of various operating states of the circuit of FIG. 5,

FIG. 11 is a schematic illustration of a computation circuit for computing a code C_(res−t) in the case of an arithmetic shift operation for shifting 2^(r) bits to the left,

FIG. 12 is a schematic illustration of a computation circuit for computing a code C_(res−t) in the case of an arithmetic shift operation for shifting α_(NbE−1)2^(NbE−1)+ . . . +α_(q)2^(q)+ . . . +α₀ bits to the left,

FIG. 13 is a schematic illustration of a computation circuit for computing a code C_(res−t) in the case of an arithmetic rotation operation for rotating one bit to the left,

FIG. 14 is a schematic illustration of a computation circuit for computing a code C_(res−t) in the case of an arithmetic addition operation, and

FIGS. 15 and 16 are more detailed schematic illustrations of two components of the circuit of FIG. 14.

Section I: Conventions, Notations and Definitions

In the figures, the same references have been used to designate elements that are the same. In the rest of this description, features and functions that are well known to those skilled in the art will not be described in detail.

In this description, the following definitions have been adopted.

A “program” designates a set of one or more predetermined functions that it is desired to have executed by a microprocessor.

A “source code” is a representation of the program in a computer language, not being able to be executed directly by a microprocessor and being intended to be converted, by a compiler, into a machine code able to be executed directly by the microprocessor.

A program or a code is said to be “able to be executed directly” or “directly executable” when it is able to be executed by a microprocessor without this microprocessor needing to compile it beforehand by way of a compiler or to interpret it by way of an interpreter.

An “instruction” denotes a machine instruction able to be executed by a microprocessor. Such an instruction consists:

of an opcode, or operation code, that codes the nature of the operation to be executed, and

of one or more operands defining the value(s) of the parameters of this operation.

The registers in which the data item or data to be processed by an instruction are stored are typically identified by one or more operands of the instruction. Likewise, the register R_(res−p) in which the result D_(res−p) of the execution of an instruction needs to be stored can also be identified by an operand of this instruction.

“Logic instruction” is used to denote an instruction from the set of instructions of the microprocessor 2 that, when executed by the arithmetic and logic unit, stores the result of a Boolean operation in a register R_(res−p) of the microprocessor. The opcode of the logic instruction identifies the Boolean operation to be executed by the arithmetic and logic unit in order to modify or combine the data item or data D₁ to D_(n).

The “&” symbol is used below to generically denote a Boolean operation. Thus, the notation D₁&D₂& . . . &D_(n) generically denotes a Boolean operation executed by the microprocessor 2 between the data D₁ to D_(n). When n=1, the Boolean operation is the complement operation also known by the name “NOT”. When n is greater than or equal to two, the Boolean operation is chosen from the group made up of the following Boolean operations and their composition:

the “OR” logic operation,

the “EXCLUSIVE-OR” logic operation,

the “AND” logic operation.

The following notations are used to denote Boolean operations:

the “OR” logic operation is denoted by the symbol “ + ”,

the “EXCLUSIVE-OR” logic operation is denoted by the symbol “XOR”,

the “AND” logic operation is denoted by the symbol “ . ”,

the “NOT” Boolean operation is denoted by the symbol “ ′ ” placed after the variable for which the complement is computed.

“Arithmetic instruction” is used to denote an instruction from the set of instructions of the microprocessor 2 that, when executed by the unit 10, stores the result of an arithmetic operation in a register R_(res−p) of the microprocessor. An arithmetic operation is different from a Boolean operation. An arithmetic operation typically belongs to the group made up of bit shift operations, bit rotation operations, addition operations, multiplication operations and division operations.

“Arithmetic and logic instruction” denotes both a logic instruction and an arithmetic instruction. Unless indicated otherwise, the term “instruction” denotes an arithmetic and logic instruction below.

A “machine code” is a set of machine instructions. It typically is a file containing a sequence of bits with the value “0” or “1”, these bits coding the instructions to be executed by the microprocessor. The machine code is able to be executed directly by the microprocessor, that is to say without the need for a preliminary compilation or interpretation.

A “binary code” is a file containing a sequence of bits bearing the value “0” or “1”. These bits code data and instructions to be executed by the microprocessor. The binary code thus comprises at least one machine code and also, in general, digital data processed by this machine code.

The expression “execution of a function” is understood to designate execution of the instructions making up this function.

A block of bits of a data item or of a variable is a group of consecutive bits of this data item or of this variable. The size of a block of bits is equal to the number of bits contained in this block.

Section II: Architecture of the Apparatus

FIG. 1 shows an electronic apparatus 1 comprising a microprocessor 2, a main memory 4 and a mass storage medium 6. For example, the apparatus 1 is a computer, a smartphone, an electronic tablet, a chip card or the like.

The microprocessor 2 here comprises:

an arithmetic and logic unit 10;

a set 12 of registers;

a control module 14;

a data input/output interface 16,

an instruction loader 18 having a program counter 26,

a queue 22 of instructions to be executed, and

a hardware security module 28.

The memory 4 is configured so as to store instructions of a binary code 30 of a program to be executed by the microprocessor 2. The memory 4 is a random access memory. The memory 4 is typically a volatile memory. The memory 4 may be a memory external to the microprocessor 2, as shown in FIG. 1. In this case, the memory 4 is formed on a substrate that is mechanically separate from the substrate on which the various elements of the microprocessor 2, such as the unit 10, are formed.

By way of illustration, the binary code 30 notably comprises a machine code 32 of a secure function. Each secure function corresponds to a set of several lines of code, for example several hundred or thousand lines of code, stored at successive addresses in the memory 4. Each line of code corresponds here to a machine word. A line of code is thus loaded into a register of the microprocessor 2 in a single read operation. Likewise, a line of code is written to the memory 4 by the microprocessor 2 in a single write operation. Each line of code codes either a single instruction or a single data item.

By way of illustration, the microprocessor 2 is a reduced instruction set computer, more commonly known by the acronym RISC.

The loader 18 loads the next instruction to be executed by the unit 10 into the queue 22 from the memory 4. More precisely, the loader 18 loads the instruction to which the program counter 26 points. To this end, the queue 22 comprises a succession of multiple registers.

The unit 10 is notably configured to execute one after another the instructions loaded into the queue 22. The instructions loaded into the queue 22 are generally executed systematically in the order in which these instructions were stored in this queue 22. The unit 10 is also capable of storing the result of these executed instructions in one or more of the registers of the set 12.

In this description, “execution by the microprocessor 2” and “execution by the unit 10” will be used synonymously.

The module 14 is configured to move data between the set 12 of registers and the interface 16. The interface 16 is notably able to acquire data and instructions, for example from the memory 4 and/or the medium 6 that are external to the microprocessor 2. To speed up transfers of data and instructions between the microprocessor 2 and the memory 4 here, the interface 16 comprises one or more cache memories. To simplify FIG. 1, only one cache memory 27 is shown. This cache memory 27 is used to temporarily store the data processed by the microprocessor 2 on the same chip as the unit 10.

The module 28 is capable of automatically executing the various operations described in detail in the sections that follow in order to make the execution of the arithmetic and logic instructions by the unit 10 secure. The module 28 operates independently and without using the unit 10. It is thus capable of processing the lines of code before and/or after they are processed by the unit 10. To this end, it notably comprises a secure nonvolatile memory 29 and various hardware computation circuits. This memory 29 can only be accessed via the module 28. In this embodiment, the module 28 is configured to execute operations such as the following operations:

verifying an integrity code,

constructing an integrity code from a data item,

constructing the integrity code for a result from the integrity codes of the processed data.

Each computation circuit constructs an integrity code C_(res−t) for a result D_(res−p) from the integrity codes C₁ to C_(n) of the data D₁ to D_(n) processed by the unit 10 and without directly using the result D_(res−p). Here, there is a computation circuit 110 for the logic instructions. There is also a computation circuit for each arithmetic instruction whose execution needs to be made secure. Here, the module 28 comprises five computation circuits 120, 130, 140, 149 and 160, each associated with a respective arithmetic instruction. These computation circuits are described in more detail in section IV that follows.

The memory 29 is used to store the secret information required for the operation of the module 28. Here, it therefore notably comprises a pre-stored secret key α.

In this example of embodiment, the set 12 comprises general registers that can be used to store any type of data. The size of each of these registers is sufficient to store a data item or a result and the integrity code associated therewith.

A data interchange bus 24 that connects the various components of the microprocessor 2 to one another is shown in FIG. 1 in order to indicate that the various components of the microprocessor are able to interchange data with one another.

The medium 6 is typically a non-volatile memory. It is for example an EEPROM or flash memory. Here, it contains a backup copy 40 of the binary code 30. It is typically this copy 40 that is automatically copied to the memory 4 to restore the code 30, for example after a power failure or the like or just before the execution of the code 30 starts.

Section III—Making Arithmetic and Logic Instructions Secure

By injecting faults while the unit 10 is operating, it is possible to disrupt its operation so that the result of the execution of the arithmetic and logic instruction does not correspond to that expected. The unit 10 is then said to have been caused to malfunction. This section describes a solution for detecting such a malfunction of the unit 10.

The registers R₁ to R_(n) denote the registers of the set 12 comprising the data D₁ to D_(n), respectively, to be processed when the instruction is executed. The register R_(res−p) denotes the register of the set 12 in which the result of the execution of the arithmetic and logic instruction is stored.

The size, in terms of the number of bits, of each data item D₁, D₂ and D_(res−p) is equal to 2^(d), where d is a whole number typically greater than four or five.

The structures of the registers R₁, R₂ and R_(res−p) are identical and shown in the specific case of the register R_(i) in FIG. 2. The register R_(i) comprises:

a bit range containing the data item D_(i),

a range containing an integrity code C_(i) allowing the integrity and the authenticity of the data item D_(i) to be checked.

The code C_(i) is generated by the module 28 using a pre-programmed relationship defined generically by the following relationship: C_(i)=Q_(α)(D_(i)), where:

the subscript i identifies a register among the registers R₁, R₂ and R_(res−p), and

the function Q_(α) is a function pre-programmed in the module 28 and configured by the secret key α.

The function Q_(α) is defined by the following relationship: Q_(α)(D_(i))=P o F_(α)(D_(i)), where the symbol “o” denotes the function-composition operation. The function P is a predetermined function. In the first embodiments described below, the function P is the identity function. Thus, in these first embodiments, the function Q_(α) is equal to the function F_(α). Examples where the function P is different from the identity function are given in the section dealing with variants.

The function F_(α) is a homomorphism from a set A equipped with the “&” Boolean operation to a set B equipped with the same “&” Boolean operation such that F_(α)(D₁&D₂)=F_(α)(D₁) & F_(α)(D₂), and such is the case for all “&” Boolean operations. Here, the sets A and B are each the set of numbers that can be coded over 2^(d) bits, that is to say the set of possible data D₁ and D₂. Thus, using the notations introduced earlier, the function F_(α) is such that for any “&” Boolean operation, the circuit 110 simply computes the integrity code C_(res−t) associated with the result D_(res−p) of the Boolean operation D₁ & D₂ using the following relationship: C_(res−t)=C₁ & C₂. When the Boolean operation executed is the complement operation of the data item D₁, the circuit 110 computes the code C_(res−t) associated with the result D_(res−p) using the following relationship: C_(res−t)=C₁′, where the symbol “ ′ ” denotes the complement operation that returns “1” when D₁=0 and that returns “0” when D₁=1.

FIG. 3 shows the function F_(α) used in this embodiment. The function F_(α) is defined by the following relationship: F_(α)(D_(i))=E₀ o . . . o E_(q) o . . . o E_(NbE−1)(D_(i)), where:

each function E_(q) is a stage of transpositions that can be executed in parallel,

NbE is the number of stages of transpositions, and

the subscript q is an order number between zero and NbE−1.

The number NbE is greater than one and less than or equal to d. Preferably, the number NbE is equal to d. In FIG. 3, d=5 and NbE=5.

Each stage E_(q) of transpositions is defined by the following relationship: E_(q)(x)=T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α1,q) o T_(α0,q)(x), where:

x is a variable whose size, in terms of the number of bits, is equal to the size of the data item D_(i),

T_(αj,q) is a conditional transposition, configured by the parameter α_(j,q), that permutes two blocks of bits B_(2j+1,q) and B_(2j,q) of the variable x when the parameter α_(j,q) is equal to “1” and that does not permute these two blocks of bits when the parameter α_(j,q) is equal to “0”,

“m+1” is the total number of transpositions T_(αj,q) of the stage E_(q),

“j” is an order number identifying the transposition T_(αj,q) among the other transpositions of the stage E_(q). The subscript “j” therefore also identifies the position of the blocks B_(2j+1,q) and B_(2j,q) in the variable x. In this application, the blocks are classified in ascending order of their subscript, which depends on the value of the subscript j.

Each transposition T_(αj,q) is distinguished from all of the other transpositions of the function F_(α) by the fact that it is the only one that permutes the two blocks B_(2j+1,q) and B_(2j,q) when the parameter α_(j,q) is equal to “1”. Moreover, the blocks B_(2j+1,q) and B_(2j,q) of all of the transpositions T_(αj,q) of the same stage E_(q) are different from one another and do not overlap. Thus, all of the transpositions T_(αj,q) of the stage E_(q) can be executed in parallel. Here, the stages E_(q) are executed one after the other in descending order of the subscripts q.

Moreover, this function F_(α) has the following characteristics:

the blocks B_(2j+1,q) and B_(2j,q) permuted by the transpositions T_(αj,q) are adjacent blocks,

the size of the blocks B_(2j+1,q) and B_(2j,q) is equal to 2^(q),

the number m+1 of transpositions T_(αj,q) per stage E_(q) is equal to 2^(d−q−1).

The notations B_(2j+1,q) and B_(2j,q) indicate that these are the (2j+1)-th and 2j-th blocks of 2^(q) bits, respectively, of the variable x. In FIG. 3, the block B_(0,q) is the least significant block of bits, the block B_(1,q) the next block and so on. Each horizontal brace in FIG. 3 encompasses the two blocks B_(2j+1,q) and B_(2j,q) of a transposition T_(αj,q) for which the parameter is α_(j,q).

In the case of FIG. 3, for all of the stages E_(q) for which q is less than NbE−1 and for all of the transpositions T_(αj,q) of this stage, the blocks B_(2j+1,q) and B_(2j,q) are both placed within one and the same block B_(l,q+1) of the higher stage E_(q+1), where the subscript “l” denotes the block of the higher stage that contains the blocks B_(2j+1,q) and B_(2j,q). For example, as can be seen in FIG. 3, the blocks B_(1,q) and B_(0,q) of the stage E_(q) are always placed within the block B_(0,q+1) of the higher stage E_(q+1). Moreover, here, each block of a higher stage E_(q+1) contains at most two blocks of the lower stage E_(q).
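By way of illustration only, the definition of the function F_(α) given above can be sketched in software as follows. This is a minimal sketch and not the hardware implementation of the module 28: the names `make_key` and `f_alpha`, the storage of the parameters α_(j,q) as `key[q][j]`, and the choice NbE=d are assumptions made for the example. The final lines check the homomorphism property on randomly drawn data.

```python
import random

def make_key(d, rng):
    # Hypothetical key layout: key[q][j] holds the parameter alpha_{j,q}.
    # Stage q has 2**(d-q-1) transpositions and NbE = d stages are used.
    return [[rng.randint(0, 1) for _ in range(1 << (d - q - 1))]
            for q in range(d)]

def f_alpha(key, x):
    # Apply E_{NbE-1} first and E_0 last, as in F = E_0 o ... o E_{NbE-1}.
    for q in reversed(range(len(key))):
        size = 1 << q                      # block size 2**q
        mask = (1 << size) - 1
        for j, a in enumerate(key[q]):
            if a == 1:                     # swap B_{2j,q} and B_{2j+1,q}
                lo = 2 * j * size
                hi = lo + size
                b_lo = (x >> lo) & mask
                b_hi = (x >> hi) & mask
                x &= ~((mask << lo) | (mask << hi))
                x |= (b_hi << lo) | (b_lo << hi)
    return x

d = 5                                      # 2**5 = 32-bit data, as in FIG. 3
width = 1 << d
full = (1 << width) - 1
rng = random.Random(0)
key = make_key(d, rng)
d1, d2 = rng.getrandbits(width), rng.getrandbits(width)

# F_alpha only moves bits around, so it commutes with bitwise operations.
assert f_alpha(key, d1 & d2) == f_alpha(key, d1) & f_alpha(key, d2)
assert f_alpha(key, d1 | d2) == f_alpha(key, d1) | f_alpha(key, d2)
assert f_alpha(key, d1 ^ d2) == f_alpha(key, d1) ^ f_alpha(key, d2)
assert f_alpha(key, d1 ^ full) == f_alpha(key, d1) ^ full   # complement
```

Because each stage only exchanges whole blocks, the function is a permutation of the bit positions selected by the key, which is why the checks hold for any key.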

The operation of the microprocessor 2 in order to make the execution of arithmetic and logic instructions secure will now be described in more detail with reference to FIG. 4.

The method begins by providing, in a step 86, the binary code 30. During this step, in this example, the binary code 30 is loaded into the memory 4 from the medium 6. Next, execution of the binary code 30 by the microprocessor 2 begins.

In a step 88, each time a data item D_(i) is stored in the cache memory 27, the module 28 computes the code C_(i) using the relationship C_(i)=F_(α)(D_(i)). Next, the data item D_(i) and the code C_(i) associated therewith are both stored in the memory 27.

Each time an instruction to load a data item into one of the registers R_(i) is executed by the unit 10, in a step 90, the data item D_(i) and the code C_(i) are written to this register R_(i).

Prior to the execution of an arithmetic and logic instruction between two data items D₁ and D₂, step 90 is executed once for the data item D₁ and once for the data item D₂.

Next, each time an arithmetic and logic instruction is about to be executed by the unit 10, just before it is executed, in a step 94, the module 28 checks whether there is an error in the data item D_(i) contained in the register R_(i) identified by an operand of the instruction to be executed.

During this step, for each register R_(i) in question, the module 28 checks, using the code C_(i) contained in the register R_(i), whether or not the data item D_(i) currently stored in this register has an error. For example, this involves the module 28 computing a code C_(i)* using the relationship C_(i)*=F_(α)(D_(i)) and without using the code C_(i) stored in the register R_(i). If the code C_(i)* computed in this way is identical to the code C_(i) stored in the register R_(i), then the integrity and authenticity of the data item D_(i) are confirmed. In that case, the module 28 detects no error and proceeds to a step 96. Otherwise, the module 28 proceeds to a step 102.

In step 102, the module 28 triggers signalling of an execution fault.

If the module 28 detects no error, in step 96, the microprocessor 2 decodes the arithmetic and logic instruction and then the unit 10 executes it and stores its result D_(res−p) in the register R_(res−p).

When the executed instruction is an arithmetic and logic instruction whose execution is secure, in parallel with step 96 or after the execution of step 96, in a step 98, the module 28 computes the code C_(res−t) by using only the codes C_(i) associated with the data D_(i) processed by the unit 10 in step 96. Thus, when it is the data D₁ and D₂ that are processed, the code C_(res−t) is computed by combining the codes C₁ and C₂ stored in the registers R₁ and R₂, respectively, prior to execution of the logic instruction.

More precisely, when the executed instruction is a logic instruction, the circuit 110 computes the code C_(res−t) using the following relationship: C_(res−t)=C₁ & C₂, where the “&” symbol denotes the Boolean operation executed by the unit 10 in step 96.

When the executed instruction is an arithmetic instruction for which the module 28 comprises a specific computation circuit for computing the code C_(res−t), then this specific circuit is selected and the code C_(res−t) is computed by this specific circuit. Examples of such specific computation circuits are described in detail in the section that follows.

Next, in a step 100, the module 28 checks whether the computed code C_(res−t) corresponds to a code C_(res−p) computed from the result D_(res−p) stored in the register R_(res−p). In the case of the circuits 110, 120, 130, 149 and 160, the code C_(res−p) is computed by implementing the relationship C_(res−p)=F_(α)(D_(res−p)). When the code C_(res−t) is computed by the circuit 140, the code C_(res−p) is equal to the result D_(res−p).

Next, the module 28 compares the computed codes C_(res−p) and C_(res−t). If these codes are different, the module 28 triggers the execution of step 102. Otherwise, this means that the code C_(res−t) corresponds to the code C_(res−p) and therefore that there was no fault during the execution of the instruction by the unit 10. In this last case, no signalling of an execution fault is triggered and the method continues with the execution of the next instruction in the queue 22.

The execution of steps 98 and 100 allows a malfunction in the unit 10 to be detected, because the computed codes C_(res−p) and C_(res−t) are identical only if the unit 10 has executed the arithmetic and logic instruction correctly. In the case of a logic instruction, this can be explained simply by the following relationship: C_(res−p)=F_(α)(D_(res−p))=F_(α)(D₁&D₂)=F_(α)(D₁) & F_(α)(D₂)=C₁ & C₂=C_(res−t). In the case of arithmetic instructions, this can be explained by the structure and the operations performed by the implemented computation circuit.

If the instruction executed in step 96 is the complement operation for the data item D₁, in step 98, the code C_(res−t) is computed using the following relationship: C_(res−t)=C₁′. The remainder of the method is then identical to what was described earlier. In the case of the complement operation, the codes C_(res−p) and C_(res−t) are identical only if the unit 10 has operated correctly. This can be demonstrated using the following relationship: C_(res−p)=F_(α)(D_(res−p))=F_(α)(D₁′)=F_(α)(D₁)′=C₁′=C_(res−t).
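The comparison made in steps 98 and 100 for a logic instruction and for the complement operation can be illustrated by the following sketch. It is a software model under stated assumptions, not the circuit 110 itself: the function names, the 8-bit data size (d=3), the key layout `key[q][j]` for the parameters α_(j,q), and the single-bit fault model are all hypothetical choices made for the example.

```python
import random

def f_alpha(key, x):
    # Software model of F_alpha: stages E_{NbE-1} down to E_0, where
    # key[q][j] (assumed layout) is the parameter alpha_{j,q}.
    for q in reversed(range(len(key))):
        size = 1 << q
        mask = (1 << size) - 1
        for j, a in enumerate(key[q]):
            if a == 1:
                lo = 2 * j * size
                hi = lo + size
                b_lo = (x >> lo) & mask
                b_hi = (x >> hi) & mask
                x &= ~((mask << lo) | (mask << hi))
                x |= (b_hi << lo) | (b_lo << hi)
    return x

def code_matches(key, c_res_t, d_res_p):
    # Step 100: compare C_res-t with C_res-p = F_alpha(D_res-p).
    return c_res_t == f_alpha(key, d_res_p)

d = 3
full = (1 << (1 << d)) - 1                 # 0xFF for 8-bit data
rng = random.Random(42)
key = [[rng.randint(0, 1) for _ in range(1 << (d - q - 1))] for q in range(d)]

d1, d2 = 0b10110100, 0b01100110
c1, c2 = f_alpha(key, d1), f_alpha(key, d2)   # codes held in R1 and R2

# Step 98 for an AND instruction: C_res-t is built from C1 and C2 only.
c_res_t = c1 & c2
ok_result = d1 & d2                        # what the unit 10 should produce
faulted_result = ok_result ^ 0b00000100    # one bit flipped by a fault

no_fault_detected = code_matches(key, c_res_t, ok_result)
fault_detected = not code_matches(key, c_res_t, faulted_result)

# Complement operation: C_res-t = C1'.
complement_ok = code_matches(key, c1 ^ full, d1 ^ full)
```

In this model a faulted result is always rejected, because F_(α) is a bijection: two different results cannot share the same code.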

In response to an execution fault being signalled, in a step 104, the microprocessor 2 implements one or more countermeasures. A wide range of countermeasures are possible. The countermeasures implemented may have very different degrees of severity. For example, the countermeasures that are implemented may range from simply displaying or simply storing an error message without interrupting the normal execution of the machine code 32 as far as definitively taking the microprocessor 2 out of service. The microprocessor 2 is considered to be out of service when it is definitively put into a state in which it is incapable of executing any machine code. Between these extreme degrees of severity, there are many other possible countermeasures, such as:

using a human-machine interface to indicate detection of the faults,

immediately interrupting the execution of the machine code 32 and/or reinitializing it, and

deleting the machine code 32 from the memory 4 and/or deleting the backup copy 40 and/or deleting the secret data.

Section IV—Examples of Computation Circuits for Computing the Code C_(res−t)

The function F_(α) described earlier allows the bit locality to be preserved. This denotes the property according to which the bits of a data item D_(i) that are placed within the block B_(2j+1,q) or B_(2j,q) still remain within this block. In other words, the transpositions T_(αj,q−1) to T_(αj,0) that apply to the bits of this block B_(2j+1,q) or B_(2j,q) cannot permute a bit of this block with a bit placed outside this block. On the other hand, the bits of the block B_(2j+1,q) or B_(2j,q) can be moved within this block by applying these transpositions T_(αj,q−1) to T_(αj,0). This stems from the fact that, for all of the stages E_(q) for which q is less than NbE−1 and for all of the transpositions T_(αj,q) of this stage, the blocks B_(2j+1,q) and B_(2j,q) are both placed within one and the same block B_(l,q+1) of the higher stage E_(q+1). It is this particular property of the function F_(α) that allows simple and fast computation circuits for computing the code C_(res−t) to be produced, for each arithmetic instruction. This is illustrated below in the particular case of bit shift instructions, a rotation instruction and an addition instruction. However, on the basis of these examples, a person skilled in the art is capable of producing other computation circuits for computing the code C_(res−t) for other arithmetic instructions.
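The bit locality property described above can be checked on a small software model. The sketch below is only illustrative: the helper names `stages_below` and `block_index_after`, the 8-bit size (d=3) and the key layout `key[s][j]` for the parameters α_(j,s) are assumptions. It follows each single bit through the stages E_(q−1) to E₀ and verifies that the bit never leaves its block of 2^(q) bits.

```python
import random

def stages_below(key, x, q):
    # Apply only the stages E_{q-1}, ..., E_0 (in that order) to x.
    for s in reversed(range(q)):
        size = 1 << s
        mask = (1 << size) - 1
        for j, a in enumerate(key[s]):
            if a == 1:
                lo = 2 * j * size
                hi = lo + size
                b_lo = (x >> lo) & mask
                b_hi = (x >> hi) & mask
                x &= ~((mask << lo) | (mask << hi))
                x |= (b_hi << lo) | (b_lo << hi)
    return x

def block_index_after(key, p, q):
    # Follow the single bit at position p through the stages below q and
    # return the index of the 2**q-bit block in which it ends up.
    y = stages_below(key, 1 << p, q)
    return (y.bit_length() - 1) >> q

d = 3
rng = random.Random(1)
key = [[rng.randint(0, 1) for _ in range(1 << (d - s - 1))] for s in range(d)]

# For every block size 2**q and every bit position p, the bit stays in
# the block whose index is p >> q.
locality_holds = all(block_index_after(key, p, q) == p >> q
                     for q in range(d + 1) for p in range(1 << d))
```

The check does not depend on the key drawn, since a transposition of a stage below q only exchanges blocks that lie inside the same block of 2^(q) bits.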

FIG. 5 shows the computation circuit 120 for computing the codeC_(res−t) when the arithmetic instruction executed by the unit 10 is alogic shift instruction for shifting the bits of the data item D₁ 1 bitto the left. The code C_(res−t) is computed only from the code C₁. Tosimplify FIG. 5, the circuit 120 is illustrated in the particular casewhere the size of the codes C_(i) and C_(res−t) C is equal to 8 bits. Inthis case, the number NbE of stages is equal to three. However, thedescription provided for this particular size can easily be generalizedfor all possible sizes.

In FIG. 5, the eight consecutive bits of the code C₁ are denoted by thesymbols a₀ to a₇, where a₀ is the least significant bit and a₇ is themost significant bit.

The bits of the code C_(res−t) are denoted by the symbols a′₀ to a′₇, these bits a′₀ to a′₇ being classified in the same order as the bits of the code C₁.

For each stage E_(q) of the function F_(α), the circuit 120 comprises a corresponding stage ED_(q). Each stage ED_(q) comprises a component CD_(αj,q) for each transposition T_(αj,q) of the stage E_(q). More precisely, each component CD_(αj,q) is associated with a respective corresponding transposition T_(αj,q). The components CD_(αj,q) are classified in ascending order of their subscript, which varies depending on the subscript j.

For q less than NbE−1, the components CD_(αj,q) are all structurally identical to one another. One of these components CD_(αj,q) is shown in more detail in FIG. 6. It has three outputs 122 to 124 and four inputs 126 to 129.

The output 122 delivers the result a.k′+c.k, where a, k and c are the values received at the inputs 126, 128 and 129, respectively, and k′ denotes the complement of k.

The output 123 delivers the result b.k+c.k′, where b is the value received at the input 127.

The output 124 delivers the result a.k+b.k′.

The inputs 126 and 127 of each component CD_(αj,0) of the stage ED₀ are connected to the 2j-th and (2j+1)-th bits, respectively, of the code C₁. For q greater than zero and q less than NbE−1, the inputs 126 and 127 of each component CD_(αj,q) are connected to the outputs 124 of the components CD_(α2j,q−1) and CD_(α(2j+1),q−1), respectively, of the stage ED_(q−1).

The input 128 of each component CD_(αj,q) receives the parameter α_(j,q) of the transposition T_(αj,q) with which it is associated.

For q greater than zero, the outputs 122 and 123 of each component CD_(αj,q) are connected to the input 129 of the components CD_(α(2j+1),q−1) and CD_(α2j,q−1), respectively. For q=0, the outputs 122 and 123 of each component CD_(αj,0) deliver the (2j+1)-th and 2j-th bits, respectively, of the code C_(res−t).

The stage ED_(NbE−1) comprises a single component CD_(α0,NbE−1). The component CD_(α0,NbE−1) is shown in more detail in FIG. 7. It has two outputs 132 and 133 and three inputs 136 to 138. The outputs 132 and 133 deliver the results “a” and “0”, respectively, when the input 138 is equal to “0”, where “a” is the value received at the input 136 and “0” is the value zero. The outputs 132 and 133 deliver the results “0” and “b”, respectively, when the input 138 is equal to “1”, where “b” is the value received at the input 137.

The inputs 136 and 137 of the component CD_(α0,NbE−1) are connected to the outputs 124 of the components CD_(α0,NbE−2) and CD_(α1,NbE−2), respectively. The input 138 receives the parameter α_(0,NbE−1). The outputs 132 and 133 are connected to the inputs 129 of the components CD_(α1,NbE−2) and CD_(α0,NbE−2), respectively.

When the instruction executed in step 96 is a shift instruction for shifting one bit to the left, the circuit 120 is selected in order to perform step 98. To this end, the bits of the code C₁ are delivered to the inputs 126 and 127 of the components CD_(αj,0). In response, the components CD_(αj,0) deliver the results present at their output 124 to the inputs 126 and 127 of the higher stage ED₁. This process is repeated stage by stage until the component CD_(α0,NbE−1) is reached. The component CD_(α0,NbE−1) then delivers the results a.k′ and b.k that are present at its outputs 132 and 133, respectively, to the inputs 129 of the components CD_(αj,NbE−2). In response, the components CD_(αj,NbE−2) deliver the results a.k′+c.k and b.k+c.k′ to the inputs 129 of the components CD_(αj,q) of the lower stage ED_(q). The process is then repeated stage by stage until the components CD_(αj,0) of the stage ED₀ are reached. The components CD_(αj,0) then deliver the various bits of the computed code C_(res−t) at their outputs 122 and 123. Next, the method continues with step 100.

The operation of the circuit 120 is illustrated in FIGS. 8 to 10 in a simplified case where the size of the codes C₁ and C_(res−t) is equal to four bits. The number NbE of stages E_(q) is therefore equal to two. Moreover, in this particular example, the parameters α_(0,0), α_(1,0) and α_(0,1) are equal to 0, 1 and 1, respectively. The bits a₀, a₁, a₂ and a₃ of the data item D₁ are classified in ascending order starting from the least significant bit to the most significant bit. In these conditions, with the values given above for the parameters α_(0,0), α_(1,0) and α_(0,1), the code C₁ is equal to a₀, a₁, a₃, a₂. The result D_(res−p) itself is equal to a₂, a₁, a₀, 0. In FIGS. 8 and 9, the “?” symbol indicates that this value is unknown at this stage of operation.

FIG. 8 shows that when the values α_(0,0) and α_(1,0) are equal to 0 and 1, respectively, the components CD_(α0,0) and CD_(α1,0) deliver the values a₃ and a₁, respectively, at their output 124. Next, in response (FIG. 9), the component CD_(α0,1) delivers the values 0 and a₁ at its outputs 132 and 133, respectively. Finally (FIG. 10), the components CD_(α0,0) and CD_(α1,0) deliver the bits of the computed code C_(res−t) at their outputs 122 and 123. In this example, the computed code C_(res−t) is 0, a₀, a₂, a₁. It can easily be checked that this computed code C_(res−t) is equal to F_(α)(D_(res−p)) in the absence of an execution fault.
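The behaviour of the circuit 120 can be reproduced with a short software model. The sketch below is a hedged illustration in Python, not the circuit itself: the function names, the list layout `alpha[q][j]` for α_(j,q) and the choice of index 0 for the least significant bit are assumptions. The gate expressions a.k+b.k′, a.k′+c.k and b.k+c.k′ are written as selections on k.

```python
def f_alpha(bits, alpha):
    """Reference model of F_alpha: stages E_{NbE-1} down to E_0, bits[0] = LSB."""
    bits = list(bits)
    for q in reversed(range(len(alpha))):
        size = 1 << q
        for j, k in enumerate(alpha[q]):
            if k:  # T_{j,q} swaps the 2^q-bit blocks of rank 2j and 2j+1
                lo, mid, hi = 2 * j * size, (2 * j + 1) * size, (2 * j + 2) * size
                bits[lo:mid], bits[mid:hi] = bits[mid:hi], bits[lo:mid]
    return bits


def shift_left_1_on_code(code, alpha):
    """Model of circuit 120: F_alpha(D << 1) derived from C = F_alpha(D)."""
    nbe = len(alpha)
    # Upward pass: outputs 124 (a.k + b.k') of the stages ED_0 .. ED_{NbE-2}.
    up = [list(code)]
    for q in range(nbe - 1):
        prev = up[-1]
        up.append([prev[2 * j] if k else prev[2 * j + 1]
                   for j, k in enumerate(alpha[q])])
    # Top component: output 133 = b.k feeds CD_0, output 132 = a.k' feeds CD_1.
    a, b = up[-1]
    carry = [b, 0] if alpha[nbe - 1][0] else [0, a]
    # Downward pass: output 123 feeds rank 2j, output 122 feeds rank 2j+1.
    for q in reversed(range(nbe - 1)):
        nxt = []
        for j, k in enumerate(alpha[q]):
            a, b, c = up[q][2 * j], up[q][2 * j + 1], carry[j]
            out122 = c if k else a         # a.k' + c.k
            out123 = b if k else c         # b.k + c.k'
            nxt += [out123, out122]
        carry = nxt
    return carry                           # bits of C_res-t, LSB first
```

Called on the 4-bit example with symbolic bits, `shift_left_1_on_code(['a2', 'a3', 'a1', 'a0'], [[0, 1], [1]])` returns `['a1', 'a2', 'a0', 0]`, which reads 0, a₀, a₂, a₁ from the most significant bit.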

FIG. 11 shows the computation circuit 130 for computing the code C_(res−t) when the arithmetic instruction executed by the unit 10 is a logic shift of 2^(r) bits to the left, where r is a whole number greater than one and less than NbE−1. In FIG. 11, the circuit 130 is shown in the particular case where NbE=d=4 and r=2.

The following notations are used below to describe the circuit 130. The symbols BC_(y,r), BCI_(x,r), BCR_(x,r) and BD_(z,r) denote the blocks of 2^(r) bits at the position y in the code C₁, at the position x in an intermediate code CI, at the position x in the code C_(res−p) and at the position z in the data item D₁, respectively. The subscripts y, x and z here are order numbers that begin at 0 and increase by 1 each time there is a move from one block of 2^(r) bits to the next block of 2^(r) bits, moving towards the most significant bits.

The circuit 130 comprises a higher permutator 131 and a scheduler 134. The permutator 131 allows the position of the blocks of 2^(r) bits within the code C_(res−t) to be computed from the code C₁. To that end, here, the permutator 131 comprises stages ED_(r) to ED_(NbE−1), which are identical to the stages ED_(r) to ED_(NbE−1) of the circuit 120 except that instead of manipulating blocks of 1 bit, it is blocks of 2^(r) bits that are manipulated.

More precisely, the inputs 126 and 127 of each component CD_(αj,q), for q greater than or equal to r, receive blocks of 2^(r) bits rather than a single bit. Thus, the outputs 122 and 123 of the components CD_(αj,r) deliver to the scheduler 134 an intermediate code CI, in which each block BCI_(x,r) of 2^(r) bits is at the desired location, that is to say at the location that it needs to occupy in the code C_(res−t).

Moreover, each block BCI_(x,r) is identical to a corresponding block BC_(y,r) of the code C₁. The reason is that the permutator 131 is only able to move the blocks BC_(y,r) with respect to one another in order to obtain the intermediate code CI. The permutator 131 does not permute the bits placed within a block BC_(y,r). The position of the block BC_(y,r) which is identical to the block BCI_(x,r) placed at the position x in the code CI, is denoted y below.

The order of the 2^(r) bits within the block BCI_(x,r) is identical to the order of the 2^(r) bits within the corresponding block BC_(y,r). The arrangement of the 2^(r) bits within the block BC_(y,r) results from application of some of the transpositions of the stages E_(r−1) to E₀ to a corresponding block BD_(z,r) when the code C₁ is calculated. The position of the block BD_(z,r) from which the positions of the bits placed within the block BC_(y,r) are computed is denoted z below. The position z of the block BD_(z,r) is not necessarily the same as the position y of the block BC_(y,r) because the transpositions of the stages E_(NbE−1) to E_(r) may have moved this block BD_(z,r) before applying the transpositions of the next stages thereto. The composition of the transpositions T_(αj,q) of the stages E_(r−1) to E₀ applied, during construction of the code C₁, to the bits placed within the block BD_(z,r) in order to obtain the block BC_(y,r) is denoted F_(αcy). The key αc_(y) is a subset of the key α that contains only the parameters of the transpositions of the function F_(αcy). The key αc_(y) is called the “current key” below because it is the one that explains the present arrangement of the bits within the block BC_(y,r). The key αc_(y) is dependent on the position y of the block BC_(y,r). The parameter of the transposition T_(αj,q) that is contained in the current key αc_(y) is also denoted αc_(j,q) below.

In the code C_(res−p), the arrangement of the 2^(r) bits within the block BCR_(x,r) results from application, to the corresponding block BD_(z,r), of some of the transpositions of the stages E_(r−1) to E₀. The corresponding block BD_(z,r) is the same as the one that corresponds to the block BC_(y,r). The composition of the transpositions T_(αj,q) of the stages E_(r−1) to E₀ applied, during construction of the code C_(res−p), to the bits placed within the block BD_(z,r) in order to obtain the block BCR_(x,r) is denoted F_(αsx). The key αs_(x) is called the “desired key” below because it is the one that determines the arrangement of the bits within the block BCR_(x,r). The key αs_(x) is dependent on the position x of the block BCR_(x,r). The parameter of the transposition T_(αj,q) that is contained in the desired key αs_(x) is also denoted αs_(j,q) below.

The result of the explanations above is that the arrangement of the 2^(r) bits within the block BCI_(x,r) is not necessarily identical to the arrangement of the 2^(r) bits within the block BCR_(x,r) that occupies the same position x in the code C_(res−t). The reason is that the current key αc_(y) is not necessarily identical to the desired key αs_(x). The scheduler 134 rearranges the order of the bits within each block BCI_(x,r) to obtain the desired block BCR_(x,r).

To explain this, let us suppose that the data item D₁ comprises four blocks, of 2^(r) bits each, denoted in the order BD_(3,r), BD_(2,r), BD_(1,r) and BD_(0,r). It is also supposed in this example that NbE=d=4 and r=2 and that the parameters α_(0,3), α_(1,2), α_(0,2) are equal to 1, 1 and 0, respectively. The computation of the code C₁ is broken down into first and second successive phases. During the first phase, it is the transpositions of the stages E_(NbE−1) to E_(r) that are applied. Thus, during this first phase, only the whole blocks BD_(z,r) of the data item D₁ are permuted. In this example, at the end of this first phase, the order of the blocks BD_(z,r) is as follows: BD_(0,r), BD_(1,r), BD_(3,r) and BD_(2,r).

Next, during the second phase, it is the transpositions of the stages E_(r−1) to E₀ that are applied. This second phase permutes only the bits within each of the blocks BD_(z,r). During this second phase, no applied transposition moves a bit placed within a block BD_(z,r) to another block. At the end of this second phase, the code C₁ is obtained. The four blocks, of 2^(r) bits each, of the code C₁ obtained are denoted in the order BC_(3,r), BC_(2,r), BC_(1,r) and BC_(0,r).

During the second phase, the bits of the block BD_(0,r) are permuted by applying the transpositions of the stages E_(r−1) to E₀, which are applied only to the 2^(r) most significant bits. This stems from the fact that at the end of the first phase, the block BD_(0,r) is at the location of the most significant bits, that is to say at the position y=3 here. The composition of the transpositions of the stages E_(r−1) to E₀ that apply only to the 2^(r) most significant bits is denoted F_(αc3). Similarly, the compositions of transpositions of the stages E_(r−1) to E₀ that permute only the bits placed within the blocks BD_(1,r), BD_(3,r) and BD_(2,r) respectively are denoted F_(αc2), F_(αc1) and F_(αc0). It is thus noted that the block BC_(3,r) of the code C₁ is the result of application of the function F_(αc3) to the bits of the block BD_(0,r). Similarly, the blocks BC_(2,r), BC_(1,r) and BC_(0,r) of the code C₁ are the results of application of the functions F_(αc2), F_(αc1) and F_(αc0) to the blocks BD_(1,r), BD_(3,r) and BD_(2,r), respectively.

Following the logic shift of 2^(r) bits to the left, the result D_(res−p) is equal to the concatenation, in order, of the blocks BD_(2,r), BD_(1,r), BD_(0,r) and of a block [0] of 2^(r) null bits.

Application of the function F_(α) to the result D_(res−p) in order to compute the code C_(res−p) is broken down, similarly, into a first phase then a second phase. At the end of the first phase, the blocks of the result D_(res−p) are classified in the following order: [0], BD_(0,r), BD_(2,r) and BD_(1,r). During the second phase, the functions F_(αs3) to F_(αs0) are applied to the blocks [0], BD_(0,r), BD_(2,r) and BD_(1,r), respectively. Thus, the blocks BCR_(3,r), BCR_(2,r), BCR_(1,r), BCR_(0,r) of the code C_(res−p) are the results of application of the functions F_(αs3) to F_(αs0) to the blocks [0], BD_(0,r), BD_(2,r) and BD_(1,r), respectively.

The intermediate code CI delivered by the permutator 131 is the concatenation, in order, of the blocks [0], BC_(3,r), BC_(0,r) and BC_(2,r), that is to say of the blocks of the code C₁ that contain the bits of the blocks BD_(0,r), BD_(2,r) and BD_(1,r), respectively. The block BCI_(2,r) is therefore the result of application of the function F_(αc3) to the block BD_(0,r). The desired block BCR_(2,r), which occupies the same position in the code C_(res−p), is the result of application of the function F_(αs2) to the block BD_(0,r). Thus, in the case of the block BCI_(2,r), the role of the scheduler 134 is to cancel application of the transpositions of the function F_(αc3) and to apply the transpositions of the function F_(αs2) instead in order to obtain the desired block BCR_(2,r).

For this, for example, for each stage E_(r−1) to E₀ of the function F_(α), the scheduler 134 comprises a corresponding stage EO_(r−1) to EO₀. Each stage EO_(q) comprises 2^(d−q−1) comparators CO_(j,q). Each comparator CO_(j,q) is associated with the corresponding transposition T_(αj,q) that permutes the blocks BCI_(2j+1,q) and BCI_(2j,q) of the intermediate code CI when the value of its parameter is equal to one. The size of the blocks BCI_(2j+1,q) and BCI_(2j,q) is equal to 2^(q). To simplify FIG. 11, only some of these comparators are shown. Each comparator CO_(j,q) receives the block BCI_(j,q+1), the parameter αc_(j,q) of the current key and the parameter αs_(j,q) of the desired key. If these two parameters are equal, the output of the comparator CO_(j,q) delivers the blocks BCI_(2j+1,q) and BCI_(2j,q) to the comparators CO_(2j+1,q−1) and CO_(2j,q−1), respectively. In other words, the transpositions T_(αcj,q) and T_(αsj,q) are identical and no change of order of the bits is necessary. If the two parameters αc_(j,q) and αs_(j,q) are different, the comparator CO_(j,q) permutes the positions of the blocks BCI_(2j+1,q) and BCI_(2j,q) before delivering them to the comparators CO_(2j+1,q−1) and CO_(2j,q−1), respectively. This replaces the transposition T_(αcj,q) with the transposition T_(αsj,q). Proceeding thus by moving from the stage EO_(r−1) to the stage EO₀, the code C_(res−t) is obtained at the outputs of the comparators CO_(j,0) of the stage EO₀. In FIG. 11, the bits of the code C_(res−t) computed by the circuit 130 are denoted by the references a₇ to a₀.

The current key αc and the desired key αs are obtained as follows, for example. The key α is divided into 2^(r) blocks BK_(z,r) of 2^(r) bits each. Each block BK_(z,r) contains only the parameters α_(j,r−1) to α_(j,0) of the transpositions to be applied, when the code C₁ is calculated, to the bits placed within the respective block BD_(z,r) of the data item D₁. Within each block BK_(z,r), the parameters α_(j,r−1) to α_(j,0) are classified in a predetermined order. For example, here, they are first of all classified in descending order of stages and the various parameters α_(j,q) of one and the same stage E_(q) are also classified in descending order of subscript j. The desired key αs is equal to the key α in which the parameters are classified as described above. Next, the various blocks BK_(z,r) are permuted, for example, by a circuit identical to the permutator 131. The key containing the blocks BK_(z,r) permuted in this way is equal to the current key αc. The parameters αc_(j,q) and αs_(j,q) are the parameters that occupy the same position in the current key αc and the desired key αs, respectively.
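The two-step principle of the circuit 130 (move whole blocks, then replace the current key of each block with the desired key) can be sketched functionally. The Python below is a behavioural model under stated assumptions, not a gate-level description: all function names are hypothetical, `alpha[q][j]` stands for α_(j,q), index 0 is the least significant position, and instead of building the keys αc and αs as bit strings the model simply tags each block with its original position y, then cancels the transpositions keyed by y and applies those keyed by x.

```python
def stage(bits, keys, q, base=0):
    """Apply T_{base+t,q}, t = 0, 1, ..., to consecutive pairs of 2^q-bit blocks."""
    bits, size = list(bits), 1 << q
    for t in range(len(bits) // (2 * size)):
        if keys[q][base + t]:
            lo, mid, hi = 2 * t * size, (2 * t + 1) * size, (2 * t + 2) * size
            bits[lo:mid], bits[mid:hi] = bits[mid:hi], bits[lo:mid]
    return bits


def f_alpha(bits, alpha):
    for q in reversed(range(len(alpha))):
        bits = stage(bits, alpha, q)
    return bits


def inner(block, alpha, pos, r, undo=False):
    """Stages E_{r-1}..E_0 restricted to the 2^r-bit block at position pos."""
    for q in (range(r) if undo else reversed(range(r))):
        block = stage(block, alpha, q, base=pos << (r - q - 1))
    return block


def permutator(units, keys, zero):
    """Network shaped like circuit 120, acting on whole units (permutator 131)."""
    up = [list(units)]
    for row in keys[:-1]:
        prev = up[-1]
        up.append([prev[2 * j] if k else prev[2 * j + 1] for j, k in enumerate(row)])
    a, b = up[-1]
    carry = [b, zero] if keys[-1][0] else [zero, a]
    for q in reversed(range(len(keys) - 1)):
        nxt = []
        for j, k in enumerate(keys[q]):
            a, b, c = up[q][2 * j], up[q][2 * j + 1], carry[j]
            nxt += [b if k else c, c if k else a]
        carry = nxt
    return carry


def shift_left_2r_on_code(code, alpha, r):
    """F_alpha(D << 2^r) derived from C = F_alpha(D)."""
    size = 1 << r
    blocks = [(y, code[y * size:(y + 1) * size]) for y in range(len(code) // size)]
    ci = permutator(blocks, alpha[r:], (None, [0] * size))   # intermediate code CI
    out = []
    for x, (y, blk) in enumerate(ci):
        if y is not None:
            blk = inner(blk, alpha, y, r, undo=True)         # cancel current key of y
        out += inner(blk, alpha, x, r)                       # apply desired key of x
    return out
```

With NbE=d=4, r=2 and α_(0,3), α_(1,2), α_(0,2) equal to 1, 1 and 0, the tags returned by `permutator` show which block of the code C₁ ends up at each position of the intermediate code CI.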

FIG. 12 shows the computation circuit 140 for computing the code C_(res−t) from the code C₁ when the arithmetic instruction executed by the unit 10 is a shift instruction for shifting the data item D₁ by a_(NbE−1)2^(NbE−1)+ . . . +a_(q)2^(q)+ . . . +a₀2⁰ bits to the left. The coefficients a_(NbE−1) to a₀ are coefficients whose Boolean values are determined and fixed in advance. The circuit 140 computes a code C_(res−t) that is equal to the result D_(res−p) if no execution fault occurs. Thus, in this embodiment, in step 100, the code C_(res−p) is equal to the result D_(res−p).

For each stage E_(q) of the function F_(α), the circuit 140 comprises a stage ET_(q). Each stage ET_(q) has four inputs 142 to 145 and three outputs 146 to 148. The input 142 receives a code CP_(q) to be permuted. Each code CP_(q) comprises 2^(d−q) blocks BCP_(j,q), of 2^(q) bits each. These blocks BCP_(j,q) do not overlap and are immediately consecutive. The subscript j is equal to the order number of the block BCP_(j,q) counting from the block BCP_(0,q) that contains the least significant bits.

Each input 142 is connected to the output 146 of the previous stage ET_(q+1), except the input 142 of the stage ET_(NbE−1), which receives the code C₁.

The input 143 receives the parameters α_(j,q) of the stage E_(q) of the function F_(α). Here, to this end, the input 143 of the stage ET_(q) is connected to the output 147 of the previous stage ET_(q+1).

The input 144 receives a permutation key K_(ETq) that contains only the parameters α_(j,q−1) to α_(j,0) that are required for implementing the transpositions of the stages E_(q−1) to E₀ of the function F_(α). This key K_(ETq) is divided into 2^(d−q) blocks BK_(j,q) of 2^(q) bits each. Each block BK_(j,q) contains only the parameters α_(j,q−1) to α_(j,0) of the transpositions to be applied to the bits placed within the block B_(j,q) of the data item D₁. Within each block BK_(j,q), the parameters α_(j,q−1) to α_(j,0) are classified in a predetermined order. For example, here, they are first of all classified in descending order of stages and the various parameters of one and the same stage E_(q−1) are also classified in descending order of subscript j. For example, let us suppose that q=3. The transpositions to be applied to the block B_(2,3) are then, successively, the transpositions T_(α2,2), T_(α5,1), T_(α4,1), T_(α11,0), T_(α10,0), T_(α9,0) and T_(α8,0). This follows from the organization of the transpositions T_(αj,q) in the function F_(α) described with reference to FIG. 3. Consequently, the block BK_(2,3) comprises the parameters α_(2,2), α_(5,1), α_(4,1), α_(11,0), α_(10,0), α_(9,0) and α_(8,0) and an additional 0 to reach the size of 2³ bits.

The input 145 receives the coefficient a_(q) of the shift to be applied to the data item D₁.

The stage ET_(q) comprises two permutators 150 and 151, two shift registers 154 and 155 and two multiplexers 158 and 159.

The permutator 150 executes the transpositions of the stage E_(q) on the data received at its input 142. In other words, the permutator 150 executes the following function: T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α0,q)(CP_(q)), where m is equal to 2^(d−q−1).

To do this, the permutator 150 is connected to the input 142 in order to receive the code CP_(q) to be permuted and to the input 143 in order to receive the parameters α_(m,q) to α_(0,q).

The code permuted by the permutator 150 is transmitted directly to a first input of the multiplexer 158 and, in parallel, to an input of the register 154.

The register 154 performs a logic shift of 2^(q) bits to the left on the bits received at its input in order to obtain a permuted and shifted code, which is delivered to a second input of the multiplexer 158.

The multiplexer 158 connects the first input directly to the output 146 if the coefficient a_(q) received at the input 145 is equal to zero. If the coefficient a_(q) received is equal to one, the multiplexer 158 connects its second input directly to the output 146.

The permutator 151 and the register 155 are identical to the permutator 150 and the register 154, respectively. They therefore perform the same operations as the permutator 150 and the register 154, respectively, but applied to the key K_(ETq), that is to say to the key received at the input 144.

The multiplexer 159 selects the permuted key delivered by the permutator 151 if the coefficient a_(q) is equal to zero. Otherwise, it selects the permuted and shifted key delivered by the register 155. Moreover, using the key selected on the basis of the coefficient a_(q), the multiplexer 159 delivers to the output 147 the parameters α_(j,q−1) required for configuring the transpositions of the stage ET_(q−1). It also delivers the key K_(ETq−1) to the output 148.

The operation of the circuit 140 is as follows: the permutator 150 of the stage ET_(q) cancels the effect of the transpositions T_(j,q) of the stage E_(q) that is applied to the data item D₁ when the code C₁ is calculated. It should be remembered here that T_(j,q) o T_(j,q) is the identity function. Thus, in the permuted code delivered by the permutator 150, the blocks BCP_(j,q) occupy the same position as the one that they had in the data item D₁. However, within each block BCP_(j,q), the position of the bits corresponds to the result of application of the transpositions of the stages E_(q−1) to E₀ to the bits placed within the block B_(j,q) of the data item D₁. Thus, the order of the bits within the blocks BCP_(j,q) is not the same as the order of the bits within the block B_(j,q) of the data item D₁. However, this is not important for the application of a logic shift of 2^(q) bits to the left, because such a shift shifts only blocks of 2^(q) bits. Thus, applying the shift of 2^(q) bits in the stage ET_(q) allows the code C_(res−t) to be computed without this requiring:

the data item D₁ to be found from the code C₁, then

the various logic shifts of 2^(q) bits to be applied to this data item D₁ that has been found.

In this embodiment, the parameters α_(j,q−1) to α_(j,0) required for configuring all of the transpositions T_(j,q−1) to T_(j,0) to be applied to the bits placed within a block BCP_(j,q) received at the input 142 are placed within the block BK_(j,q) of the same size and occupying the same position j in the key K_(ETq) received at the input 144. At the output 148, this relationship between the position of the blocks BCP_(j,q−1) and the position of the blocks BK_(j,q−1) is preserved. In other words, in the key delivered at the output 148, the block BK_(j,q−1) contains all of the parameters α_(j,q−2) to α_(j,0) required for configuring the transpositions to be applied to the bits placed within the block BCP_(j,q−1) delivered at the output 146.

To preserve this relationship, the permutator 151 and the register 155 apply the same transpositions and the same shifts, respectively, as those applied to the code CP_(q), but this time to the key K_(ETq) received. Next, the multiplexer 159 extracts from the selected key the parameters α_(j,q−1) to be delivered to the output 147. The multiplexer 159 also extracts the parameters α_(j,q−2) to α_(j,0) and generates the key K_(ETq−1) that is delivered to the output 148. Here, the multiplexer 159 identifies the parameters α_(j,q) to be extracted on the basis of their position in the selected key.

The output 146 of the stage ET₀ delivers the code C_(res−t) to be compared with the code C_(res−p) in order to check, in step 100, that the execution of the shift instruction has taken place without a fault.
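The stage-by-stage operation of the circuit 140 can be sketched as follows. This Python model is an assumption-laden illustration: the function names are hypothetical, `alpha[q][j]` stands for α_(j,q), index 0 is the least significant position, and the key K_(ETq) is kept as one list per stage rather than packed into blocks BK_(j,q) with padding. What the sketch does preserve is the principle that the key parameters of the lower stages are permuted and shifted exactly like the code blocks they belong to.

```python
def f_alpha(bits, alpha):
    """Reference model of F_alpha, used here only to build test codes."""
    bits = list(bits)
    for q in reversed(range(len(alpha))):
        for j, k in enumerate(alpha[q]):
            if k:
                swap_pair(bits, j, 1 << q)
    return bits


def swap_pair(seq, j, w):
    """Swap, in place, the w-element blocks of rank 2j and 2j+1 of a list."""
    lo, mid, hi = 2 * j * w, (2 * j + 1) * w, (2 * j + 2) * w
    seq[lo:mid], seq[mid:hi] = seq[mid:hi], seq[lo:mid]


def shift_blocks(seq, w):
    """Shift a list by one w-element block towards the high ranks, filling with 0."""
    return [0] * w + seq[:len(seq) - w]


def shift_left_on_code(code, alpha, amount):
    """Functional model of circuit 140: D << amount obtained from C = F_alpha(D)."""
    cp = list(code)
    key = [list(row) for row in alpha]          # working copy of the key
    for q in reversed(range(len(alpha))):       # stages ET_{NbE-1} ... ET_0
        for j, k in enumerate(list(key[q])):
            if k:
                swap_pair(cp, j, 1 << q)        # permutator 150 on the code
                for qq in range(q):             # permutator 151 on the key
                    swap_pair(key[qq], j, 1 << (q - qq - 1))
        if (amount >> q) & 1:                   # coefficient a_q
            cp = shift_blocks(cp, 1 << q)       # register 154
            for qq in range(q):                 # register 155
                key[qq] = shift_blocks(key[qq], 1 << (q - qq - 1))
    return cp
```

On the 4-bit code `['a2', 'a3', 'a1', 'a0']` with the key `[[0, 1], [1]]` and a shift of 1, the model returns `[0, 'a0', 'a1', 'a2']`, that is the shifted data item itself, as stated for this embodiment.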

FIG. 13 shows a computation circuit 149 for computing the code C_(res−t) when the arithmetic instruction executed by the unit 10 is a rotation instruction for rotating the bits of the data item D₁ one bit to the left. This rotation instruction is identical to the shift instruction for shifting one bit to the left, except that the most significant bit of the data item D₁ is reinjected on the right and therefore becomes the least significant bit after this rotation has been executed.

For each stage E_(q) of the function F_(α), the circuit 149 comprises a corresponding stage ER_(q). Each stage ER_(q) comprises a component CC_(αj,q) for each transposition T_(αj,q) of the stage E_(q). More precisely, each component CC_(αj,q) is associated with a respective corresponding transposition T_(αj,q).

For q less than NbE−1, each component CC_(αj,q) is identical to the component CD_(αj,q) of the circuit 120. The component CC_(α0,NbE−1) is shown in FIG. 13 in the particular case where NbE is equal to three. In FIG. 13, it is therefore denoted by the reference CR_(α0,2). The component CC_(α0,NbE−1) has two inputs 152 and 153 and two outputs 156 and 157. The inputs 152 and 153 are connected to the outputs 124 of the components CC_(α0,NbE−2) and CC_(α1,NbE−2), respectively. The outputs 156 and 157 are connected to the inputs 129 of the components CC_(α1,NbE−2) and CC_(α0,NbE−2), respectively.

The output 156 always delivers the same value as that received at the input 152. The output 157 always delivers the same value as that received at the input 153. In other words, whatever the value of the parameter α_(0,NbE−1), the component CC_(α0,NbE−1) crosses over the bits received at its inputs: the value coming from the component CC_(α0,NbE−2) is returned to the component CC_(α1,NbE−2), and vice versa. This allows the most significant bit to be reinjected at the location intended to receive the least significant bit.

The operation of the circuit 149 is derived from the explanations given for the circuit 120.
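A software model of the circuit 149 differs from a model of the circuit 120 only in its top component. The Python sketch below is illustrative and rests on assumed conventions: hypothetical function names, `alpha[q][j]` for α_(j,q), and index 0 for the least significant bit.

```python
def f_alpha(bits, alpha):
    """Reference model of F_alpha: stages E_{NbE-1} down to E_0, bits[0] = LSB."""
    bits = list(bits)
    for q in reversed(range(len(alpha))):
        size = 1 << q
        for j, k in enumerate(alpha[q]):
            if k:
                lo, mid, hi = 2 * j * size, (2 * j + 1) * size, (2 * j + 2) * size
                bits[lo:mid], bits[mid:hi] = bits[mid:hi], bits[lo:mid]
    return bits


def rotate_left_1_on_code(code, alpha):
    """Model of circuit 149: F_alpha(rotl(D, 1)) derived from C = F_alpha(D)."""
    nbe = len(alpha)
    up = [list(code)]                      # upward pass, outputs 124
    for q in range(nbe - 1):
        prev = up[-1]
        up.append([prev[2 * j] if k else prev[2 * j + 1]
                   for j, k in enumerate(alpha[q])])
    # Top component CC_{0,NbE-1}: output 157 (= input 153) feeds CC_0 and
    # output 156 (= input 152) feeds CC_1, independently of the parameter.
    a, b = up[-1]
    carry = [b, a]
    for q in reversed(range(nbe - 1)):     # downward pass, as in circuit 120
        nxt = []
        for j, k in enumerate(alpha[q]):
            a, b, c = up[q][2 * j], up[q][2 * j + 1], carry[j]
            nxt += [b if k else c, c if k else a]
        carry = nxt
    return carry
```

For the 4-bit code `['a2', 'a3', 'a1', 'a0']` and the key `[[0, 1], [1]]`, the model returns `['a1', 'a2', 'a0', 'a3']`: the bit a₃ takes the place that the shift circuit fills with 0.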

FIG. 14 shows the computation circuit 160 for computing the code C_(res−t) when the arithmetic instruction executed by the unit 10 is an addition instruction for adding the data D₁ and D₂. This time, the code C_(res−t) is computed from the codes C₁ and C₂. The circuit 160 computes the code C_(res−t), which, in the absence of an execution fault, is equal to F_(α)(D_(res−p)).

FIG. 14 shows the circuit 160 in the particular case where the size of the codes C₁ and C₂ and of the code C_(res−t) is equal to 8 bits. In this figure, the 8 bits of the codes C₁ and C₂ are denoted, in the order towards the most significant bit, a₀ to a₇ and b₀ to b₇, respectively. The 8 bits of the computed code C_(res−t) are denoted, in the same order, s₀ to s₇.

The circuit 160 comprises a stage 162 of adders and a carry look-ahead unit 164.

The stage 162 comprises an adder AD_(p) for each pair of bits a_(p), b_(p) to be added. The subscript p denotes the position of the bits in the codes C₁, C₂ and C_(res−t). These adders AD_(p) are structurally identical to one another and differ from one another only in the bits a_(p) and b_(p) that they add.

The adder AD_(p) is shown in more detail in FIG. 15. The adder AD_(p) has three inputs 170 to 172 and three outputs 174 to 176. The inputs 170 and 171 receive the bits a_(p) and b_(p), respectively.

The input 172 receives a carry c_(p) to be used in the addition of the bits a_(p) and b_(p).

The output 174 delivers the result a_(p).b_(p). The output 175 delivers the result a_(p)+b_(p). The output 176 delivers the bit s_(p). The bit s_(p) is computed by the adder AD_(p) using the following relationship: s_(p)=a_(p) XOR b_(p) XOR c_(p).

The function of the unit 164 is to rapidly propagate the various carries c_(p) to be used by the adders AD_(p). To that end, here, the unit 164 computes the carries c_(p) from the information delivered at the outputs 174 and 175 of each adder AD_(p).

In this embodiment, the unit 164 comprises a stage EA_(q) for each stage E_(q) of the function F_(α). Each stage EA_(q) comprises 2^(d−q−1) components CA_(αj,q), where the subscript j is the order number of the component CA_(αj,q) in the stage EA_(q). Each component CA_(αj,q) is associated with a respective transposition T_(αj,q) of the function F_(α).

Here, the components CA_(αj,q) are all structurally identical to one another and are distinguished only by their connection to the other components of the circuit 160.

The component CA_(αj,q) is shown in more detail in FIG. 16. The component CA_(αj,q) has six inputs 180 to 185 and four outputs 188 to 191. The output 188 delivers the result (G_(r)+P_(r).G_(l)).k+(G_(l)+P_(l).G_(r)).k′, where G_(r), P_(r), G_(l), P_(l) and k are the values received at the inputs 180, 181, 182, 183 and 185, respectively.

The output 189 delivers the result P_(l).P_(r). The output 190 delivers the result c_(i).k′+(G_(l)+P_(l).c_(i)).k, where c_(i) is the value received at the input 184. The output 191 delivers the result c_(i).k+(G_(r)+P_(r).c_(i)).k′.

The input 185 receives the parameter α_(j,q) of the transposition T_(αj,q) associated with this component CA_(αj,q).

The inputs 180 and 181 of each component CA_(αj,0) of the stage EA₀ are connected to the outputs 174 and 175, respectively, of the adder AD_(2j). The inputs 182 and 183 of each component CA_(αj,0) of the stage EA₀ are connected to the outputs 174 and 175, respectively, of the adder AD_(2j+1).

The outputs 190 and 191 of each component CA_(αj,0) of the stage EA₀ are connected to the inputs 172 of the adders AD_(2j) and AD_(2j+1), respectively.

For q greater than zero:

the inputs 180 and 181 of each component CA_(αj,q) of the stage EA_(q) are connected to the outputs 188 and 189, respectively, of the component CA_(α2j,q−1) of the lower stage EA_(q−1),

the inputs 182 and 183 of each component CA_(αj,q) are connected to the outputs 188 and 189, respectively, of the component CA_(α(2j+1),q−1) of the lower stage EA_(q−1),

the outputs 190 and 191 of each component CA_(αj,q) are connected to the inputs 184 of the components CA_(α2j,q−1) and CA_(α(2j+1),q−1), respectively, of the stage EA_(q−1).

The outputs 188 and 189 of the component CA_(α0,NbE−1) are not used. The input 184 of the component CA_(α0,NbE−1) allows the carry computed by another circuit, for example identical to the circuit 160, to be received. This allows multiple circuits 160 to be linked to one another, so as to perform additions on data of greater size.

The unit 164 functions as a conventional carry look-ahead computation unit. Here, however, this conventional unit is modified to take account of the transpositions T_(αj,q) and therefore the parameters α_(j,q) that are used for computing the code C_(res−t). In summary, the components CA_(αj,q) propagate the carry to the right when the parameter α_(j,q) is equal to zero and to the left when the parameter α_(j,q) is equal to one.
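The modified carry look-ahead principle can be illustrated with a small model. The Python sketch below is not the circuit 160 itself: function names are hypothetical, `alpha[q][j]` stands for α_(j,q), index 0 is the least significant position, and "r" and "l" designate the inputs coming from rank 2j and rank 2j+1 of the lower stage, which is how the connections are read here.

```python
def f_alpha(bits, alpha):
    """Reference model of F_alpha: stages E_{NbE-1} down to E_0, bits[0] = LSB."""
    bits = list(bits)
    for q in reversed(range(len(alpha))):
        size = 1 << q
        for j, k in enumerate(alpha[q]):
            if k:
                lo, mid, hi = 2 * j * size, (2 * j + 1) * size, (2 * j + 2) * size
                bits[lo:mid], bits[mid:hi] = bits[mid:hi], bits[lo:mid]
    return bits


def to_bits(value, n):
    """n-bit list of a non-negative integer, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def add_on_codes(c1, c2, alpha, carry_in=0):
    """Model of circuit 160: F_alpha(D1 + D2) from C1 = F_alpha(D1), C2 = F_alpha(D2)."""
    # Stage 162: outputs 174 (a.b) and 175 (a+b) of each adder AD_p.
    gp = [[(a & b, a | b) for a, b in zip(c1, c2)]]
    # Upward pass: outputs 188 and 189 of the components CA_{j,q}.
    for q in range(len(alpha)):
        row = []
        for j, k in enumerate(alpha[q]):
            (g_r, p_r), (g_l, p_l) = gp[q][2 * j], gp[q][2 * j + 1]
            g = (g_r | (p_r & g_l)) if k else (g_l | (p_l & g_r))
            row.append((g, p_l & p_r))
        gp.append(row)
    # Downward pass: outputs 190 and 191 feed ranks 2j and 2j+1 of the lower stage.
    carry = [carry_in]
    for q in reversed(range(len(alpha))):
        nxt = []
        for j, k in enumerate(alpha[q]):
            (g_r, p_r), (g_l, p_l) = gp[q][2 * j], gp[q][2 * j + 1]
            c = carry[j]
            out190 = (g_l | (p_l & c)) if k else c
            out191 = c if k else (g_r | (p_r & c))
            nxt += [out190, out191]
        carry = nxt
    return [a ^ b ^ c for a, b, c in zip(c1, c2, carry)]   # s_p = a_p XOR b_p XOR c_p
```

The sum is computed modulo 2^n, where n is the number of bits of the codes; the final carry out is not modelled.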

Section IV—Variants

Variants of the Function Q_(α):

In the relationship Q_(α)(D_(i))=P o F_(α)(D_(i)), the function P is not necessarily the identity function. For example, the function P is a compression function that constructs, from each of the bits of the result F_(α)(D_(i)), a code C_(i) whose size, in terms of the number of bits, is less than 2^(d). The reason is that when the function P is the identity function, the size of the code C_(i) is equal to the size of the data item D_(i), that is to say equal to 2^(d). Now, in some contexts, it is desirable to reduce the size of the code C_(i). For example, this is desirable in order to reduce the space that it can take up in the cache memory 27. By way of illustration, to this end, the function P is the function that performs the following operations:

1) the function P divides the result F_(α)(D_(i)) into two blocks P₀ and P₁ of bits of the same size, then,

2) the function P performs an “EXCLUSIVE-OR” between the blocks P₀ and P₁. In this case, the size of the code C_(i) is halved and is equal to 2^(d−1).
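The two operations above can be written in a few lines. This Python sketch is a minimal illustration: the name `compress_xor` is hypothetical, and taking P₀ as the low half and P₁ as the high half of the list is an assumption, since the text does not fix which half is which.

```python
def compress_xor(bits):
    """Compression function P: XOR of the two halves of F_alpha(D_i)."""
    half = len(bits) // 2
    p0, p1 = bits[:half], bits[half:]      # blocks P0 and P1 of the same size
    return [x ^ y for x, y in zip(p0, p1)]
```

For an 8-bit input such as `[1, 0, 1, 1, 0, 0, 1, 0]`, the function returns the 4-bit list `[1, 0, 0, 1]`.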

Many other compression functions P are possible. The function P can also be different from the identity function and from a compression function. For example, the function P is an encryption or other function.

When the function P is different from the identity function, each of thecomputation circuits described here is broken down into a first and asecond subcircuit. The first subcircuit is identical to one of thecomputation circuits described earlier. This first subcircuit thereforedelivers a code C_(res−int), which, in the absence of an executionfault, is consistently equal to the result F_(α)(D_(res−p)). The secondsubcircuit applies the predetermined function P to the code C_(res−int)in order to obtain the code C_(res−t).

The various variants are described below in the particular case wherethe function P is equal to the identity function. However, thesevariants also apply to the case where the function P is different fromthe identity function.

As a variant, the transposition T_(αj,q) permutes the blocks B_(2j+1,q)and B_(2j,q) when the parameter α_(j,q) =0 and does not permute themwhen the parameter α_(j,q)=1.

The function F_(α) has been described in the particular case where thestages of transpositions first transpose the blocks of greater size andend by transposing the blocks of smaller size. However, as a variant,the stages E_(q) of transpositions can be executed and classified inreverse order. In this case, the transpositions of smaller size areapplied first, ending by applying the transposition T_(α0,NbE−1) ofgreater size. The order in which the various stages E_(q) are classifieddoes not modify the bit locality property described earlier. Thus, evenwhen the order of the stages E_(q) is reversed, it is possible toconstruct fast and simple computation circuits for computing the codeC_(res−t) for arithmetic instructions.
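The two stage orders can be illustrated by a minimal software sketch. It assumes the case where NbE is equal to d, where each stage E_(q) permutes adjacent blocks of 2^(q) bits and where a parameter equal to one triggers the permutation. The names apply_stage, f_alpha and KEY, as well as the layout of the key as one list of parameters per stage, are hypothetical and serve only this illustration.

```python
def apply_stage(x: int, q: int, params: list) -> int:
    """Stage E_q: transposition j swaps blocks B_(2j,q) and B_(2j+1,q)
    of 2**q bits each when its parameter alpha_(j,q) equals 1."""
    size = 1 << q
    mask = (1 << size) - 1
    for j, alpha in enumerate(params):
        if alpha == 1:
            lo = 2 * j * size            # position of block B_(2j,q)
            hi = lo + size               # position of block B_(2j+1,q)
            b_lo = (x >> lo) & mask
            b_hi = (x >> hi) & mask
            x &= ~((mask << lo) | (mask << hi))   # clear both blocks
            x |= (b_hi << lo) | (b_lo << hi)      # write them swapped
    return x


def f_alpha(x: int, key: list, largest_first: bool = True) -> int:
    """Apply all stages, either from the largest blocks down to the
    smallest (as described) or in the reverse order (the variant)."""
    n = len(key)
    order = range(n - 1, -1, -1) if largest_first else range(n)
    for q in order:
        x = apply_stage(x, q, key[q])
    return x


# Assumed example key for d = 3: key[q] holds the 2**(d-q-1)
# parameters alpha_(j,q) of stage E_q.
KEY = [[1, 0, 1, 1], [0, 1], [1]]

code_described = f_alpha(0b00000001, KEY)
code_variant = f_alpha(0b00000001, KEY, largest_first=False)
```

In both orders the function only moves bits from one position to another, so the number of bits equal to one in the data item is preserved.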

As a variant, one or more stages of the function F_(α) are omitted.

Some of the transpositions T_(αj,q) can be omitted. In this case, at least one of the stages comprises fewer than 2^(d−q−1) transpositions T_(αj,q).

Variants of the Computation Circuits for Computing the Code C_(res−t):

The teaching provided here in the case of a few arithmetic instructions, such as shifts, rotations and additions, can be applied to other arithmetic instructions. In particular, it is possible to take the teaching provided in these particular cases as a basis for developing computation circuits for computing a code C_(res−t) for other arithmetic instructions. For example, the circuit 120 can be modified to compute the code C_(res−t) corresponding to a logic shift instruction for shifting 1 bit to the right. In practice, it is sufficient, for this purpose, to retain the same circuit as the circuit 120, but to send the complement of the parameter α_(j,q), that is to say the parameter α_(j,q)′, to the input 128 or 138 of each of the components CD_(αj,q).

Equally, it is possible to construct a computation circuit for computing the code C_(res−t) for an arithmetic shift instruction for shifting one bit to the left or to the right. An arithmetic shift is distinguished from the logic shifts described earlier by the fact that the most significant bit remains unchanged and is therefore not shifted, unlike the other bits.

In the circuit 140, each 2^(q)-bit shift register can be replaced by a register that performs a rotation of 2^(q) bits. The circuit thus obtained computes the result C_(res−t), which corresponds to a rotation instruction for rotating the bits of the data item D₁ a_(NbE−1)2^(NbE−1)+ . . . +a_(q)2^(q)+ . . . +a₀ bits to the left. By replacing these registers with registers that perform a shift to the right or a rotation to the right, the circuit obtained computes the code C_(res−t) corresponding to a shift or rotation instruction for shifting or rotating a_(NbE−1)2^(NbE−1)+ . . . +a_(q)2^(q)+ . . . +a₀ bits to the right.
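The decomposition of the shift or rotation amount into the coefficients a_(q) can be illustrated by the following sketch, which models only the chain of 2^(q)-bit registers and deliberately leaves out the permutators of the circuit 140. The function names, the list coeffs (with coeffs[q] standing for a_(q)) and the parameter width are assumptions introduced for this illustration.

```python
def staged_shift_left(x: int, coeffs: list, width: int) -> int:
    """Shift x left by sum(a_q * 2**q), one conditional 2**q-bit
    shift per stage, from the highest stage down to stage 0."""
    mask = (1 << width) - 1
    for q in range(len(coeffs) - 1, -1, -1):
        if coeffs[q] == 1:               # stage selected by a_q = 1
            x = (x << (1 << q)) & mask
    return x


def staged_rotate_left(x: int, coeffs: list, width: int) -> int:
    """Same chain, each shift register replaced by a rotation register.
    Assumes 2**q < width for every stage."""
    mask = (1 << width) - 1
    for q in range(len(coeffs) - 1, -1, -1):
        if coeffs[q] == 1:
            s = 1 << q
            x = ((x << s) | (x >> (width - s))) & mask
    return x


# Assumed example: a_2 = 1, a_1 = 0, a_0 = 1, i.e. an amount of 5 bits.
shifted = staged_shift_left(0b00000111, [1, 0, 1], 8)
rotated = staged_rotate_left(0b10000001, [1, 0, 1], 8)
```

Each stage either passes its input unchanged or moves it by 2^(q) bits, so the composition of the selected stages moves the bits by the amount encoded by the coefficients.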

It is also possible to link multiple computation circuits in order to compute the code C_(res−t) for an operation that corresponds to the composition of multiple suboperations for each of which there is already a computation circuit for computing the code C_(res−t). For example, if the executed operation is a logic shift instruction for shifting two bits to the left, the code C_(res−t) is computed by applying the circuit 120 twice. To that end, the code C₁ is first injected at the inputs of the circuit 120 and a first intermediate code CI_(cres−t) is obtained. Next, the code CI_(cres−t) is injected at the inputs of this same circuit 120 in order to obtain the desired code C_(res−t).

Similarly, the circuits 120 and 130 can be linked in order to compute the code C_(res−t) for a logic shift to the left for any number of bits.

If the coefficients a_(q) are constant, as a variant, the inputs 145 of the circuit 140 are omitted. In this case, the coefficients a_(q) are wired inside each stage ET_(q). For example, the multiplexer 158 is omitted. If the value of the coefficient a_(q) is consistently equal to one, then the output of the register 154 is directly connected to the output 146. The multiplexer 159 is also simplified, since it consistently selects the output of the register 155. Conversely, if the coefficient a_(q) is consistently equal to zero, then the output of the permutator 150 is directly connected to the output 146 and the register 154 is omitted. Equally, the register 155 is omitted.

As a variant, the component CA_(α0,NbE−1) of the circuit 160 is devoid of the outputs 188 and 189, which are not used.

As a variant, the output 175 of the component AD_(p) delivers the result a_(p) XOR b_(p) rather than the result a_(p)+b_(p).

In a simplified embodiment, only the execution of some arithmetic or logic instructions is secure. For example, only the execution of one of the following instructions is made secure by implementing the method described here: the bit shift instruction, the bit rotation instruction and the bit addition instruction. The execution of the other instructions, such as the logic instructions, is thus not secure. In this latter case, this means that, for these other instructions, no code C_(res−t) is calculated and step 100 is omitted.

Other Variants:

The module 28 is not necessarily a hardware module of a single block. As a variant, it is made up of multiple hardware submodules that each perform one of the specific functions of the module 28. These hardware submodules are thus preferably embedded as close as possible to the data that they process. For example, in this case, the hardware submodule that computes the code C_(i) associated with each data item D_(i) is embedded in the cache memory 27. From then on, the code C_(i) associated with each data item D_(i) stored in the cache memory 27 is computed locally in this cache memory.

As a variant, each instruction of the machine code is also associated with an integrity code F_(α)(I_(i)) computed from the value of the loaded instruction I_(i). This code F_(α)(I_(i)) is verified just before the unit 10 executes the instruction I_(i). This allows the signalling of an execution fault to be triggered if the instruction I_(i) is modified in the queue 22.

It is possible to associate the code C_(i) with the data item D_(i) in various ways. For example, instead of storing the code C_(i) in the same register R_(i) as the one that contains the data item D_(i), the code C_(i) is stored in a register RC_(i) associated with the register R_(i).

The secret key α can be modified, for example, at regular intervals.

Other embodiments of step 100 are possible. For example, the module 28 computes, as in the case of the circuit 140, a code C_(res−t) that is equal to the result D_(res−p) in the absence of an execution fault. In this case, the code C_(res−t) is computed using the following relationship: C_(res−t)=F_(α)⁻¹(CI_(res−t)), where:

the function F_(α)⁻¹ is the inverse of the function F_(α), and

CI_(res−t) is the code computed, for example, by the circuits 120, 130,149 or 160.
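One possible way of obtaining the inverse function in software is sketched below. It relies on the observation that a stage made up of conditional transpositions of non-overlapping blocks is its own inverse, so that applying the same stages in the opposite order undoes the function F_(α). The sketch assumes NbE equal to d, adjacent blocks of 2^(q) bits and a parameter equal to one to trigger the permutation. All of the names and the example key are hypothetical, and the call to f_alpha that produces ci_res merely stands in for the output of one of the computation circuits.

```python
def apply_stage(x: int, q: int, params: list) -> int:
    """Stage E_q: swap adjacent 2**q-bit blocks where alpha_(j,q) = 1."""
    size = 1 << q
    mask = (1 << size) - 1
    for j, alpha in enumerate(params):
        if alpha == 1:
            lo, hi = 2 * j * size, (2 * j + 1) * size
            b_lo, b_hi = (x >> lo) & mask, (x >> hi) & mask
            x &= ~((mask << lo) | (mask << hi))
            x |= (b_hi << lo) | (b_lo << hi)
    return x


def f_alpha(x: int, key: list) -> int:
    """F_alpha = E_0 o ... o E_(NbE-1): the largest blocks come first."""
    for q in range(len(key) - 1, -1, -1):
        x = apply_stage(x, q, key[q])
    return x


def f_alpha_inv(x: int, key: list) -> int:
    """Inverse of f_alpha: the same stages applied in the opposite order."""
    for q in range(len(key)):
        x = apply_stage(x, q, key[q])
    return x


KEY = [[1, 0, 1, 1], [0, 1], [1]]       # assumed key for d = 3
d_res = (0b00101101 << 1) & 0xFF        # result of a one-bit left shift
ci_res = f_alpha(d_res, KEY)            # stand-in for the code CI_(res-t)
c_res = f_alpha_inv(ci_res, KEY)        # equals d_res when no fault occurs
```

In this model, comparing c_res with d_res directly plays the role of the check performed after step 100.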

The various computation circuits for computing the code C_(res−t) that are described here can be implemented independently of one another.

Section V—Advantages of the Described Embodiments

Computing the code C_(i) using a secret key α makes the method for executing the machine code more robust in the face of attempted attacks. The reason is that the attacker then has greater difficulty in falsifying the code C_(res−t) so that it corresponds to an expected code when an execution fault has been deliberately introduced. Thus, the methods described earlier have the same advantages in terms of robustness as the one described in the article DEMEYER2019. Moreover, the use of a function F_(α) that has the locality property described earlier makes it possible to obtain computation circuits for computing the code C_(res−t) that are simpler and faster than those required for implementing the method of the article DEMEYER2019.

The circuits 120, 130, 140, 149 and 160 each allow simple and fast computation of the code C_(res−t) corresponding to a specific arithmetic instruction.

The circuit 110 also allows simple and fast computation of the code C_(res−t) for all Boolean operations.
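The reason why Boolean operations are simple to handle can be checked with a short sketch. When the function P is the identity function, the function F_(α) only changes the positions of the bits, so applying a bitwise operation to the codes gives the code of the result. The sketch below assumes NbE equal to d and adjacent blocks of 2^(q) bits. The names apply_stage, f_alpha, codes_match and KEY are hypothetical and are not part of the circuit 110.

```python
import operator


def apply_stage(x: int, q: int, params: list) -> int:
    """Stage E_q: swap adjacent 2**q-bit blocks where alpha_(j,q) = 1."""
    size = 1 << q
    mask = (1 << size) - 1
    for j, alpha in enumerate(params):
        if alpha == 1:
            lo, hi = 2 * j * size, (2 * j + 1) * size
            b_lo, b_hi = (x >> lo) & mask, (x >> hi) & mask
            x &= ~((mask << lo) | (mask << hi))
            x |= (b_hi << lo) | (b_lo << hi)
    return x


def f_alpha(x: int, key: list) -> int:
    """Apply the stages from the largest blocks down to single bits."""
    for q in range(len(key) - 1, -1, -1):
        x = apply_stage(x, q, key[q])
    return x


def codes_match(op, d1: int, d2: int, key: list) -> bool:
    """True if op applied to the codes equals the code of op(d1, d2)."""
    c_res_t = op(f_alpha(d1, key), f_alpha(d2, key))   # from the codes only
    c_res_p = f_alpha(op(d1, d2), key)                 # from the result
    return c_res_t == c_res_p


KEY = [[1, 0, 1, 1], [0, 1], [1]]       # assumed key for d = 3
ok = all(codes_match(op, 0b11001010, 0b01100110, KEY)
         for op in (operator.and_, operator.or_, operator.xor))
```

The same check fails in general for operations that move information between bit positions, such as additions and shifts, which is why dedicated circuits are described for them.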

1. Microprocessor equipped with an arithmetic and logic unit and with a hardware security module, wherein: a) the arithmetic and logic unit is capable of executing an arithmetic instruction, comprising an opcode and one or more operands, that, when executed by the arithmetic and logic unit of the microprocessor, causes a mathematical operation D₁*D₂* . . . *D_(n) to be performed and the result of this operation to be stored in a register R_(res−p) where: the subscript n is equal to the number of data items D_(i) processed by the arithmetic instruction, the subscript n being greater than or equal to one, D₁ to D_(n) are data items that are stored in registers R₁ to R_(n), respectively, of the microprocessor, the size, in terms of the number of bits, of each of these data items D_(i) being equal to 2^(d), where d is an integer greater than two, the registers R₁ to R_(n) are the registers denoted by the operands of the arithmetic instruction, the symbol “*” is the arithmetic operation denoted by the opcode of the arithmetic instruction, b) the microprocessor is configured to perform the following operations: 1) for each data item D_(i), computation of a code C_(i) using a relationship C_(i)=Q_(α)(D_(i)) and association of the computed code C_(i) with the data item D_(i), the function Q_(α) being a pre-programmed function configured by a secret key α that is pre-stored in the microprocessor and known only to the microprocessor, 2) each time an instruction for loading a data item D_(i) into a register R_(i) of the microprocessor is executed by the microprocessor, the loaded data item D_(i) is stored in the register R_(i) and the code C_(i) associated therewith is stored in the same register R_(i) or in a register associated with the register R_(i), then 3) execution of the arithmetic instruction and storage of the result D_(res−p) of this execution in the register R_(res−p), and computation, by the security module, of a code C_(res−t) using the codes C₁, C₂, . . . , C_(n) and without using the result D_(res−p), then 4) checking that the computed code C_(res−t) corresponds to a code C_(res−p) obtained from the result D_(res−p) and triggering of the signalling of an execution fault if the code C_(res−t) does not correspond to the code C_(res−p) and, otherwise, suppressing this signalling, wherein the function Q_(α) is defined by the following relationship: Q_(α)(D_(i))=P o F_(α)(D_(i)), where P is a predetermined function and F_(α) is a function defined by the following relationship: F_(α)(D_(i))=E₀ o . . . o E_(q) o . . . o E_(NbE−1)(D_(i)), where each function E_(q) is a stage of transpositions and the index q is an order number between zero and NbE−1, where NbE is a whole number greater than one and less than or equal to d, each stage E_(q) of transpositions being defined by the following relationship: E_(q)(x)=T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α1,q) o T_(α0,q)(x), where: x is a variable whose size, in terms of the number of bits, is equal to the size of the data item D_(i), T_(αj,q) is a conditional transposition, configured by the parameter α_(j,q), that permutes two blocks of bits B_(2j+1,q) and B_(2j,q) of the variable x when the parameter α_(j,q) is equal to a first value and that does not permute these two blocks of bits when the parameter α_(j,q) is equal to a second value, the transposition T_(αj,q) being distinguished from all of the other transpositions of the function F_(α) by the fact that it is the only one that permutes the two blocks B_(2j+1,q) and B_(2j,q) when the parameter α_(j,q) is equal to the first value, the blocks B_(2j+1,q) and B_(2j,q) of all of the transpositions T_(αj,q) of the stage E_(q) being different from one another and not overlapping in such a way that all of the transpositions T_(αj,q) of the stage E_(q) can be executed in parallel, “m+1” is the total number of transpositions T_(αj,q) of the stage E_(q), “j” is an order number identifying the transposition T_(αj,q) among the other transpositions of the stage E_(q), the symbol “o” denotes the function-composition operation, the concatenation of the bits of all of the parameters α_(j,q) of all of the stages E_(q) is equal to the value of the secret key α, and for all of the stages E_(q) for which q is less than NbE−1 and for all of the transpositions T_(αj,q) of this stage, the blocks B_(2j+1,q) and B_(2j,q) are placed within one and the same block of greater size permuted by a transposition of the higher stage E_(q+1) when the parameter of this transposition of the higher stage E_(q+1) is equal to the first value.
 2. Microprocessor according to claim 1, wherein the number NbE is equal to d, the sizes of the blocks permuted by all of the transpositions T_(αj,q) of one and the same stage E_(q) are equal to 2^(q), the blocks B_(2j+1,q) and B_(2j,q) permuted by each transposition T_(αj,q) are adjacent.
 3. Microprocessor according to claim 2, wherein the arithmetic instruction is a shift instruction for shifting the bits of the data item D₁ one bit to the left or to the right and the code C_(res−t) is equal to Q_(α)(D_(res−p)) in the absence of an execution fault, the security module comprising, to this end, a computation circuit for computing the code C_(res−t) from the code C₁ and without using the result D_(res−p), this computation circuit comprising: for each stage E_(q) of the function F_(α), a stage ED_(q) comprising components CD_(αj,q), each component CD_(αj,q) being associated with a respective transposition T_(αj,q) of the stage E_(q), the components CD_(αj,q) of the stages ED₀ to ED_(NbE−2) all being structurally identical to one another and each having a first, a second, a third and a fourth input and a first, a second and a third output: the first output delivering the result a.k′+c.k, where: “a”, “k” and “c” are the values received at the first, third and fourth inputs, respectively, of this component, the symbol “ + ” denotes the “OR” logic operation, the symbol “ . ” denotes the “AND” logic operation, and the symbol “ ′ ” denotes the “NOT” logic operation, the second output delivering the result b.k+c.k′, where “b” is the value received at the second input of this component, the third output delivering the result a.k+b.k′, the component CD_(αj,NbE−1) of the stage ED_(NbE−1) having a first, a second and a third input and a first and a second output: the first and second outputs delivering the results “a” and “0”, respectively, when the third input is equal to “0”, where “a” is the value received at the first input and “0” is the value zero, and the first and second outputs delivering the results “0” and “b”, respectively, when the third input is equal to “1”, where “b” is the value received at the second input, for q equal to zero, the first and second inputs of each component CD_(αj,0) of the stage ED₀ being connected to the 2j-th bit and to the (2j+1)-th bit, respectively, of the code C₁, and for q greater than zero, the first and second inputs of each component CD_(αj,q) of the stage ED_(q) being connected, respectively, to the third outputs of the 2j-th and (2j+1)-th components, respectively, of the stage ED_(q−1), the third input of each component CD_(αj,q) is connected to the parameter α_(j,q) when the instruction is a left shift and to the complement α_(j,q)′ of the parameter α_(j,q) when the instruction is a right shift, for q greater than zero, the first and second outputs of each component CD_(αj,q) are connected to the fourth input of the (2j+1)-th and 2j-th components, respectively, of the stage E_(q−1), and for q equal to zero, the first and second outputs of each component CD_(αj,0) deliver the (2j+1)-th and 2j-th bits, respectively, of the code C_(res−t).
 4. Microprocessor according to claim 2, wherein the arithmetic instruction is a shift instruction for shifting the bits of the data item D₁ 2^(r) bits to the left or to the right, where r is a whole number greater than one and less than NbE, and the code C_(res−t) is equal to Q_(α)(D_(res−p)) in the absence of an execution fault, the security module comprising, to this end, a computation circuit for computing the code C_(res−t) from the code C₁ and without using the result D_(res−p), this computation circuit comprising a block permutator and a bit scheduler, the permutator comprises, for each stage E_(q) of transpositions included in the group of stages E_(r) to E_(NbE−1) that permute blocks of size greater than or equal to 2^(r), a stage ED_(q) comprising components CD_(αj,q), each component CD_(αj,q) being associated with a respective transposition T_(αj,q) of the stage E_(q), the components CD_(αj,q) of the stages ED_(r) to ED_(NbE−2) all being structurally identical to one another and each having a first, a second, a third and a fourth input and a first, a second and a third output: the first output delivering the result a.k′+c.k, where: “a”, “k” and “c” are the values received at the first, third and fourth inputs, respectively, of this component, the symbol “ + ” denotes the “OR” logic operation, the symbol “ . ” denotes the “AND” logic operation, and the symbol “ ′ ” denotes the “NOT” logic operation, the second output delivering the result b.k+c.k′, where “b” is the value received at the second input of this component, the third output delivering the result a.k+b.k′, the component CD_(αj,NbE−1) of the stage ED_(NbE−1) having a first, a second and a third input and a first and a second output: the first and second outputs delivering the results “a” and “0”, respectively, when the third input is equal to “0”, where “a” is the value received at the first input and “0” is the value zero, and the first and second outputs delivering the results “0” and “b”, respectively, when the third input is equal to “1”, where “b” is the value received at the second input, the first and second inputs of each component CD_(αj,r) of the stage ED_(r) being connected to the 2j-th block of 2^(r) bits and to the (2j+1)-th block of 2^(r) bits, respectively, of the code C₁, and for q greater than r, the first and second inputs of each component CD_(αj,q) of the stage ED_(q) being connected, respectively, to the third outputs of the 2j-th and (2j+1)-th components, respectively, of the stage ED_(q−1), the third input of each component CD_(αj,q) is connected to the parameter α_(j,q) when the instruction is a left shift and to the complement α_(j,q)′ of the parameter α_(j,q) when the instruction is a right shift, for q greater than r, the first and second outputs of each component CD_(αj,q) are connected to the fourth input of the (2j+1)-th and 2j-th components, respectively, of the stage E_(q−1), and for q equal to r, the first and second outputs of each component CD_(αj,r) deliver the (2j+1)-th intermediate block BCI_(2j+1,r) of 2^(r) bits and the 2j-th intermediate block BCI_(2j,r) of 2^(r) bits, respectively, of an intermediate code CI_(res−t), each block BCI_(x,r) being identical to a respective block BC_(y,r) of the code C₁, the order of the bits within this block BC_(y,r) having been obtained, during computation of the code C₁, by applying a first set of transpositions of the stages E_(r−1) to E₀ to a respective block BD_(z,r) of 2^(r) bits of the data item D₁, where the subscripts x, y and z are each an identifier of a position of a block of 2^(r) bits in a data item or a code of 2^(d) bits, the subscript x being equal to 2j+1 if the order number of the block of 2^(r) bits is odd and equal to 2j otherwise, and the first set of transpositions being dependent on the position y, the scheduler capable of replacing each intermediate block BCI_(x,r) of the intermediate code with a block BCR_(x,r) of 2^(r) bits that is equal to the result of the application of a second set of transpositions of the stages E_(r−1) to E₀ to the block BD_(x,r) of 2^(r) bits of the data item D₁, the second set of transpositions being dependent on the position x.
 5. Microprocessor according to claim 4, wherein the scheduler is capable, for each block BCI_(x,r): of comparing each parameter of a current permutation key with the corresponding parameter of a desired permutation key, the current key containing the parameters of all of the transpositions of the first set of transpositions used, during computation of the code C₁, to permute the bits placed within the block BD_(z,r), the desired key containing the parameters of all of the transpositions of the second set of transpositions used, during a computation of F_(α)(D_(res−p)), to permute the bits placed within the block BD_(x,r), and, each time the compared parameters are different, of executing the transposition configured by this parameter on the bits of the block BCI_(x,r) and, each time the compared parameters are identical, of keeping the order of the bits of the block BCI_(x,r) unchanged.
 6. Microprocessor according to claim 2, wherein the arithmetic instruction is a rotation instruction for rotating the bits of the data item D₁ one bit to the left or to the right and the code C_(res−t) is equal to Q_(α)(D_(res−p)) in the absence of an execution fault, the security module comprising, to this end, a computation circuit for computing the code C_(res−t) from the code C₁ and without using the result D_(res−p), this computation circuit comprising: for each stage E_(q) of the function F_(α), a stage ED_(q) comprising components CC_(αj,q), each component CC_(αj,q) being associated with a respective transposition T_(αj,q) of the stage E_(q), the components CC_(αj,q) of the stages ED₀ to ED_(NbE−2) all being structurally identical to one another and each having a first, a second, a third and a fourth input and a first, a second and a third output: the first output delivering the result a.k′+c.k, where: “a”, “k” and “c” are the values received at the first, third and fourth inputs, respectively, of this component, the symbol “ + ” denotes the “OR” logic operation, the symbol “ . ” denotes the “AND” logic operation, and the symbol “ ′ ” denotes the “NOT” logic operation, the second output delivering the result b.k+c.k′, where “b” is the value received at the second input of this component, the third output delivering the result a.k+b.k′, the component CC_(αj,NbE−1) of the stage ED_(NbE−1) having a first and a second input and a first and a second output, the first and second outputs delivering the results “a” and “b”, respectively, where “a” and “b” are the values received at the first and second inputs, respectively, and for q equal to zero, the first and second inputs of each component CC_(αj,0) of the stage ED₀ being connected to the 2j-th bit and to the (2j+1)-th bit, respectively, of the code C₁, and for q greater than zero, the first and second inputs of each component CC_(αj,q) of the stage ED_(q) being connected, respectively, to the third outputs of the 2j-th and (2j+1)-th components, respectively, of the stage ED_(q−1), the third input of each component CC_(αj,q) is connected to the parameter α_(j,q) when the instruction is a left rotation and to the complement α_(j,q)′ of the parameter α_(j,q) when the instruction is a right rotation, for q greater than zero, the first and second outputs of each component CC_(αj,q) are connected to the fourth input of the (2j+1)-th and 2j-th components, respectively, of the stage E_(q−1), and for q equal to zero, the first and second outputs of each component CC_(αj,0) deliver the (2j+1)-th and 2j-th bits, respectively, of the code C_(res−t).
 7. Microprocessor according to claim 2, wherein the arithmetic instruction is a shift instruction for shifting the bits of the data item D₁ a_(NbE−1)2^(NbE−1)+ . . . +a_(q)2^(q)+ . . . +a₀ bits to the left or to the right, where the coefficients a_(q) are predetermined coefficients, and the security module comprises a computation unit for computing the code C_(res−t) from the code C₁ and without using the result D_(res−p), this code C_(res−t) being identical to the result D_(res−p) in the absence of an execution fault during the execution of this shift instruction by the arithmetic and logic unit, this computation circuit comprising, for each stage E_(q) of the function F_(α), a stage ET_(q), each stage ET_(q) having: a first input for receiving a code to be permuted, this code to be permuted being formed by a juxtaposition of blocks BCP_(y,q) of 2^(q) bits each, the subscript y being the order number of the block of 2^(q) bits in this juxtaposition, this first input receiving the code C₁ when q=NbE−1, a second input for receiving the parameters α_(m,q) to α_(0,q) of the transpositions of the stage E_(q), this second input receiving the parameters α_(0,NbE−1) when q=NbE−1, a third input for receiving a permutation key for the next stages, this permutation key being formed by a juxtaposition of blocks BK_(y,q) of 2^(q) bits each, each block BK_(y,q) containing only the parameters of the transpositions of the next stages E_(q−1) to E₀ that permute the bits placed within the block BCP_(y,q) of the code received at the first input, this third input receiving the key α without the parameter α_(0,NbE−1) when q=NbE−1, a first permutator capable of performing the permutation T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α1,q) o T_(α0,q)(x) in order to obtain a permuted code, where the variable x is equal to the code to be permuted that is received at the first input and the permutation parameters α_(m,q) to α_(0,q) are those received at the second input, a first shift register capable of performing a shift of 2^(q) bits for the permuted code in order to obtain a permuted and shifted code, a first multiplexer configured to select the permuted code if the coefficient a_(q) is equal to “0” and to select the permuted and shifted code if the coefficient a_(q) is equal to “1”, this first multiplexer having an output that delivers to the first input of the next stage ET_(q−1) the code selected from the permuted code and the permuted and shifted code, a second permutator capable of executing the permutation T_(αm,q) o . . . o T_(αj,q) o . . . o T_(α1,q) o T_(α0,q)(k) in order to obtain a permuted key, where the variable k is equal to the permutation key received at the third input, a second shift register capable of performing a shift of 2^(q) bits for the permuted key in order to obtain a permuted and shifted key, the first and second shift registers shifting the bits to the left if the shift instruction is a shift instruction for shifting the bits to the left and, otherwise, shifting the bits to the right if the shift instruction is a shift instruction for shifting the bits to the right, a second multiplexer configured to select the permuted key if the coefficient a_(q) is equal to “0” and to select the permuted and shifted key if the coefficient a_(q) is equal to “1”, this second multiplexer having: a first output that delivers, to the second input of the next stage ET_(q−1), the parameters α_(m,q−1) to α_(0,q−1) extracted from predetermined locations of the key selected from among the permuted key and the permuted and shifted key, and a second output that delivers, to the third input of the next stage ET_(q−1), a permutation key for the next stages comprising the parameters α_(m,q−2) to α_(0,0) extracted from predetermined locations of the key selected from among the permuted key and the permuted and shifted key, the permutation key delivered at the second output being formed by a juxtaposition of blocks BK_(y,q−1) of 2^(q−1) bits each, each block BK_(y,q−1) containing the parameters of the transpositions of the next stages E_(q−2) to E₀ to be applied to the block BCP_(y,q−1) of the data item received at the first input of the next stage ET_(q−1), and during operation 4), the microprocessor is configured to compare the code C_(res−t) with the result D_(res−p) and to trigger the signalling of a fault only if they are not identical.
 8. Microprocessor according to claim 2, wherein the arithmetic instruction is an addition instruction for adding the data item D₁ to a data item D₂, and the security module comprises a computation circuit for computing the code C_(res−t) from the codes C₁ and C₂ and without using the result D_(res−p), this computation circuit comprising: an adder stage comprising, for each pair of bits of the codes C₁ and C₂ to be added, an adder AD_(p), where the subscript p is the order number of the two bits of the codes C₁ and C₂ to be added by this adder, this order number varying from zero to 2^(d)−1, each adder AD_(p) having: a first, a second and a third input, the first and second inputs being intended to receive the bit of the code C₁ and the bit of the code C₂ to be added, respectively, a first output delivering the result a.b, where: “a” and “b” are the values received at the first and second inputs, respectively, and the symbol “ . ” denotes the “AND” logic operation, a second output delivering the result a+b, where the symbol “ + ” denotes the “OR” logic operation, a third output delivering the result a XOR b XOR c, where “c” is the value received at the third input and XOR is the symbol of the “EXCLUSIVE-OR” logic operation, the concatenation of the bits delivered at the third outputs of the various adders AD_(p) of the addition stage forming the code C_(res−t), a carry look-ahead unit having an input at which the permutation key α is received, this carry look-ahead unit being capable of determining the values of the third inputs of the adders of the adder stage that need to have a carry delivered to them from the results delivered at the first and second outputs of each adder and from the parameters α_(j,q) of the key α received at its input.
 9. Microprocessor according to claim 8, wherein: the carry look-ahead unit comprises, for each stage E_(q) of the function F_(α), a stage EA_(q) comprising components CA_(αj,q), each component CA_(αj,q) being associated with a respective transposition T_(αj,q) of the stage E_(q), the components CA_(αj,q) of the stages EA₀ to EA_(NbE−2) all being structurally identical to one another and each having a first, a second, a third, a fourth, a fifth and a sixth input and a first, a second, a third and a fourth output: the first output delivering the result (G_(r)+P_(r).G_(I)).k+(G_(I)+P_(I).G_(r)).k′, where: G_(r), P_(r), G_(I), P_(I) and k are the values received at the first, second, third, fourth and sixth inputs, respectively, of this component, the symbol “ + ” denotes the “OR” logic operation, the symbol “ . ” denotes the “AND” logic operation, and the symbol “ ′ ” denotes the “NOT” logic operation, the second output delivering the result P_(I).P_(r), the third output delivering the result c_(i).k′+(G_(I)+P_(I).c_(i)).k, where c_(i) is the value received at the fifth input, the fourth output delivering the result c_(i).k+(G_(r)+P_(r).c_(i)).k′, the component CA_(αj,NbE−1) of the stage EA_(NbE−1) has a first, a second, a third, a fourth, a fifth and a sixth input and a first and a second output: the first output delivering the result c_(i).k′+(G_(I)+P_(I).c_(i)).k, where c_(i) is the value received at the fifth input, the second output delivering the result c_(i).k+(G_(r)+P_(r).c_(i)).k′, the first and second inputs of each component CA_(αj,0) of the stage EA₀ are connected to the first and second outputs, respectively, of the adder AD_(2j), where the adder AD_(2j) is the adder AD_(p), the subscript p of which is equal to 2j, the third and fourth inputs of each component CA_(αj,0) of the stage EA₀ are connected to the first and second outputs, respectively, of the adder AD_(2j+1), where the adder AD_(2j+1) is the adder AD_(p), the subscript p of which is equal to 2j+1, the third and fourth outputs of each component CA_(αj,0) of the stage EA₀ are connected, respectively, to the third inputs of the adders AD_(2j) and AD_(2j+1), respectively, for q greater than zero, the first and second inputs of each component CA_(αj,q) of the stage EA_(q) are connected to the first and second outputs, respectively, of the component CA_(α2j,q−1) of the stage EA_(q−1), for q greater than zero, the third and fourth inputs of each component CA_(αj,q) of the stage EA_(q) are connected to the first and second outputs, respectively, of the component CA_(α(2j+1),q−1) of the stage EA_(q−1), for q greater than zero, the third and fourth outputs of each component CA_(αj,q) of the stage EA_(q) are connected, respectively, to the fifth inputs of the components CA_(α2j,q−1) and CA_(α(2j+1),q−1), respectively, the sixth input of each component CA_(αj,q) is capable of receiving the parameter α_(j,q).
 10. Microprocessor according to claim 1, wherein: the arithmetic and logic unit is capable of executing a logic instruction that, when executed, causes a Boolean operation D₁&D₂& . . . &D_(n) to be performed and the result of this Boolean operation to be stored in the register R_(res−p), where the “&” symbol denotes the Boolean operation, and the microprocessor is configured to perform operations 1) to 4) for this logic instruction and, during the execution of operation 3), to compute the code C_(res−t) using the following relationship: C_(res−t)=C₁ & C₂ & . . . &C_(n).
 11. Microprocessor according to claim 1, wherein the arithmetic operation is chosen from the group made up of a bit shift, a bit rotation and an addition. 